As a result of growing demands for Automated Data
Stock Point Logistics Integrated Communications Environment
with special focus on distributed systems and attempts to
outline the benefits for the SPLICE system from the use of a
data dictionary/directory system. Interface considerations
between the data dictionary/directory system (DDS) and
neighboring modules are also discussed.

The SPLICE (Stock Point Logistics Integrated
Communications Environment) concept comes as a result of the
ever-growing demands of the U.S. Navy for automated data
processing [Ref. 1] and inventory control at various points.
A design and implementation strategy is necessary, based on a
distributed architecture for a local area network (LAN).

SPLICE is designed to increase the ADP facilities of the
existing Navy stock points and inventory control points.
Because the current Uniform Automated Data Processing
System-Stock Points cannot support the growing requirements
for automated data processing (ADP) without a total
redesign, an effort has been undertaken to improve the system
in the short and long term [Ref. 1]. Two major objectives are
behind the SPLICE development:

1. To increase CRT display terminals so users can access
the system's data base interactively.

2. To standardize the various current interfaces across
the 62 supply sites.

The design approach starts from the design of the
logical or virtual Local Area Network (LAN), by specifying
all the functional modules, their characteristics, and
the communication protocols without focusing on the hardware
characteristics. A later phase of the SPLICE project will
address the mapping of the virtual LAN requirements onto
a physical local network.


The following functional modules are involved in the
development of the system:

- Local communications (LC)

- National communications (NC)

- Front-End processing (FEP)

- Terminal management (TM)

- Data base management (DBM)

- Session services (SS)

- Peripheral management (PM)

- Resource allocation (RA)

This LAN design provides for distributed control but
does not provide for the distribution of data bases within a
LAN. The data bases of the SPLICE system are geographically
distributed over a wide area and, for the purpose of
maintaining the integrity of the system, the data base
functions are centralized within each LAN. A DBMS module for
the system must at least provide dictionary, integrity,
recovery, query language, and security features as well as
compatibility with existing COBOL programs.

The functions of the DBM module would be:

- Catalog, to maintain a catalog of file names and
status (name, open or closed, size, physical address of
file, physical address of index, application used in, date
entered into system, expiration date if any, location of
backup copy, format, access restrictions).

- Operations, under a menu selection scheme, to perform
various functions (retrieve and display a record, update
specified fields of a record, delete a record, insert a
record, print a file, print a record or specified fields of
a record, answer specified queries and display and print the
results).

- Dictionary, for defining and characterizing the data
elements. The dictionary must be integrated with the DBMS.
This will contribute to data integrity and consistency
throughout the system and should also be of great assistance
in designing report formats.
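
The catalog function can be pictured as one record per file.
The sketch below is only an illustration: the class names
`CatalogEntry` and `Catalog`, the field names, and the field
types are assumptions chosen to mirror the attributes listed
above, not part of the SPLICE specification.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class CatalogEntry:
    """One catalog record per file (hypothetical field names)."""
    name: str
    is_open: bool
    size_bytes: int
    file_address: int          # physical address of the file
    index_address: int         # physical address of the index
    applications: List[str]    # applications the file is used in
    date_entered: date
    expiration_date: Optional[date]
    backup_location: str
    record_format: str
    access_restrictions: List[str] = field(default_factory=list)


class Catalog:
    """In-memory catalog keyed by file name."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"file {entry.name!r} is already cataloged")
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def expired(self, today: date) -> List[str]:
        """Names of files whose expiration date has passed."""
        return sorted(
            e.name for e in self._entries.values()
            if e.expiration_date is not None and e.expiration_date < today
        )
```

A real DBM module would keep these records on disk and guard
them with the access restrictions they describe; the
dictionary of Python objects here only stands in for that
store.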

With this improved design it is believed that the SPLICE
system will provide economical and responsive support
capabilities among the 62 different geographical locations,
each having a different mix of application and terminal
requirements.

The SPLICE functional design approach suggests developing
several functional modules, distributed in minicomputers
throughout the LAN with the necessary communications
to support them [Ref. 2]. This design provides for higher
system availability than the centralized approach since
functional modules can be moved from one physical node to
another without changing their logical addresses [Ref. 3].
At this time there exist no exact methods for designing
distributed systems, and so an objective of the NPS research
program for SPLICE is to advance knowledge about distributed
systems and to increase understanding of how distributed
systems must be designed in order to operate effectively.

Distributed systems have problems associated with their
design that need solutions in particular areas [Ref. 4: p.
2]. The distributed system must provide the ability for the
user to communicate and access information across the 62
local networks interconnected by the Defense Data Network
(DDN). It must be possible for the user at Naval Supply
Center (NSC) Oakland to access the Inventory Control Point
(ICP) database at Mechanicsburg in the same way as the local
database at Oakland [Ref. 4].

The data dictionary must provide support to the above by
uniquely naming and identifying objects in the overall
SPLICE system. In the case of a message which is destined
to another local network, the dictionary can be used to
obtain the physical destination address with the help of the
Session Services module (Figure 1.1). For object naming
and addressing and software maintenance, the data dictionary
can help by storing all the name-to-address mapping and
routing information. The data dictionary can also be used
to specify task requirements for the user terminal
processes. The data dictionary in a distributed environment
will cooperate closely with the session services module,
which provides assistance to the user terminal processes in
carrying out their tasks. Thus a distributed operating
system must provide, in addition to other functions, the
ability to access effectively the dictionary/directory
system (Figure 1.2 from Ref. 4).

Figure 1.1 Network Services Directory and Dictionary.
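
The name-to-address role described above can be sketched as a
small lookup table. This is a minimal illustration under
stated assumptions: the `Directory` and `Address` classes and
the LAN and node labels used with them are hypothetical, and
the actual SPLICE addressing scheme is not given in this text.

```python
from typing import Dict, NamedTuple


class Address(NamedTuple):
    lan: str    # local network that hosts the object
    node: str   # physical node inside that network


class Directory:
    """Name-to-address mapping held by the dictionary/directory."""

    def __init__(self, local_lan: str) -> None:
        self.local_lan = local_lan
        self._table: Dict[str, Address] = {}

    def bind(self, logical_name: str, address: Address) -> None:
        # Rebinding a name models moving a module to another node
        # without changing its logical name.
        self._table[logical_name] = address

    def resolve(self, logical_name: str) -> Address:
        return self._table[logical_name]

    def route(self, logical_name: str) -> str:
        """Return 'local' or 'DDN' depending on where the object lives."""
        address = self.resolve(logical_name)
        return "local" if address.lan == self.local_lan else "DDN"
```

A session services module could call `route` before
forwarding a message, so that a user terminal process names
only the logical object and never the physical node.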

Major systems of the SPLICE application environment are
the Integrated Disbursement and Accounting (IDA), Automated
Procurement and Data Entry (APADE), Uniform Automated Data
Processing System-Stock Points (UADPS-SP), and Logistics
Data System Trident LDS. Each of the above systems has its
own elements, files, programs, transactions, users and
reports [Ref. 4].

It is vital for the system to manage all the resources
efficiently, and the distributed environment makes this job
more difficult. A data dictionary/directory system (DDS)
seems to be one approach to solving the data design and
management problem. For the centralized database environment
three aspects are emphasized [Ref. 5]:

- The software interfaces between the D/D system and
other software packages

- The convert functions of the D/D system

- The environmental dependency between the D/D system and
a database management system (DBMS).

For the distributed database environment, as in the case
of SPLICE, there must be extensions to the centralized D/D,
additional software interfaces are required, and the use of the

Figure 1.2 Layered Operating System Design (Ref. 4).

B. OBJECTIVES OF THESIS

The SPLICE project at the Naval Postgraduate School
(NPS) takes the approach of designing the logical or virtual
Local Area Network (LAN) first, specifying all the
functional modules, their characteristics and the
communication protocols, rather than focusing on the hardware
characteristics of the LAN first [Ref. 1], developing
alternatives for SPLICE Local Area Networks. After providing
a functional specification for a distributed operating
system, user interface specifications are provided, where the
dictionary/directory system (DDS) constitutes a major
component [Ref. 4] and its function is to provide support for
naming and identifying objects in SPLICE.

The objectives of this thesis are to investigate the
area of data dictionary/directory systems (DDS), to outline
the advantages and disadvantages of these systems, and to
present the underlying ideas. Special attention is paid
to the distributed environment and to the
benefits for the SPLICE system from using a dictionary/
directory system. Finally, an attempt will be made to
introduce the interface requirements between a data

A. GENERAL REVIEW

A data dictionary is a description of data resources. It
contains both machine-readable and human-readable
descriptions of the database tables, their attributes,
interrelationships, and semantics. It is usually not very
large, but it has a very rich structure. Most systems have a
data dictionary facility which stores metadata about the
database aside from the database itself. The data dictionary
is often built on top of the DBMS as a special application
with a special data definition language.
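
A dictionary built on top of a DBMS can be illustrated with
an ordinary table that holds metadata about other tables. The
sketch below is an assumption-laden example, not the SPLICE
or Tandem design: it uses Python's standard `sqlite3` module
and plain SQL instead of a special data definition language,
and the `stock_item` and `data_element` tables are
hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Ordinary application table (hypothetical).
conn.execute("CREATE TABLE stock_item (part_no TEXT PRIMARY KEY, qty INTEGER)")

# Dictionary table: metadata stored aside from the data itself.
conn.execute(
    """CREATE TABLE data_element (
           table_name  TEXT,
           column_name TEXT,
           data_type   TEXT,
           description TEXT,
           PRIMARY KEY (table_name, column_name))"""
)
conn.executemany(
    "INSERT INTO data_element VALUES (?, ?, ?, ?)",
    [
        ("stock_item", "part_no", "TEXT", "Part number of the stocked item"),
        ("stock_item", "qty", "INTEGER", "Quantity on hand"),
    ],
)


def describe(table_name: str) -> list:
    """Human-readable description of a table, read from the dictionary."""
    rows = conn.execute(
        "SELECT column_name, data_type, description FROM data_element "
        "WHERE table_name = ? ORDER BY column_name",
        (table_name,),
    )
    return rows.fetchall()
```

Because the dictionary is itself a table, the same query
facilities that serve application data also serve the
metadata.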

Thus a DDS is a set of one or more databases containing
data about an organization's information resources. Those
resources can be retrieved and analyzed using standard
database management system (DBMS) capabilities. The concept
of a data dictionary system has existed in the data
processing industry for a number of years. Use of such a
system consists, basically, of an attempt to capture and
store in a central location definitions of data and other
entries of interest [Ref. 6]. The principles of such a system
are:

- Provide for better data control

- Provide for better documentation

- Improve the quality of the systems that are built in
terms of user functionality and satisfaction and system
maintainability.

The data dictionary helps to capture and document data
elements, their definitions and some of their descriptive
attributes. It also provides for logical grouping of data
elements during the process of gathering requirements to
build a new system. The data element dictionary provides the
vocabulary that can be used between the systems analyst and
the end-user [Ref. 6].

Next in the spectrum of usage, the DDS help is twofold.
First, if the data dictionary is available it can be extended
to include information on how and by whom the data elements
can be used. Thus a dictionary can be used to store the
definitions of data elements and the definitions of other
data constructs (records, files), the definitions of
processes (programs or manual processes), and definitions of
data users (individuals, organizations). The second trend
that contributed to this extended usage of a dictionary
system was the gradual migration away from the use of
traditional files toward the concept of a central, integrated
database, distributed across the DDN but centralized within
each LAN, under the control of a database management system.

The problem of duplication of data (data redundancy) can
be solved inside each LAN, but another mechanism must be
provided in order to solve that problem across the DDN.
This problem must be examined carefully, and that mechanism
must provide for economy because sometimes data redundancy
may be more cost-efficient than the frequent use of the DDN.

The above is vital for system design because in the
SPLICE environment data are to be shared not only by
different systems, but also by a wide range of users. The
basic concept of a DBMS is to provide a centrally located
set of definitions of data within each LAN that is to be
shared in order to assure that different users will access
common data with a set of consistent definitions.

The DDS acts as a repository of all definitive
information about the database such as characteristics,
relationships, and access authorizations. These databases, as
implied by the term 'logically', can be physically stored in
diverse locations within each LAN but are logically linked
via communications and the DDS.

The data dictionary system located in a node within each
LAN can be used to provide the above definitions and thus
the required data consistency.

Separating the data dictionary from the database raises
two problems [Ref. 7]:

- The dictionary and data base may disagree with one
another unless one interface has control of both functions

- Having a separate data dictionary implies having a
separate language for the definition and manipulation of the
dictionary database.
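
The first problem, disagreement between a separately kept
dictionary and the database, can be made concrete with a
small check. The sketch below assumes an SQLite database
accessed through Python's standard `sqlite3` module and a
dictionary held as a plain mapping of column names to
declared types; both are illustrative choices, and the helper
names `actual_columns` and `disagreements` are hypothetical.

```python
import sqlite3
from typing import Dict, List


def actual_columns(conn: sqlite3.Connection, table: str) -> Dict[str, str]:
    """Column name -> declared type, as the DBMS itself reports it."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
    rows = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    return {name: col_type for _cid, name, col_type, *_rest in rows}


def disagreements(conn: sqlite3.Connection, table: str,
                  dictionary: Dict[str, str]) -> List[str]:
    """List the ways a separate dictionary differs from the database."""
    actual = actual_columns(conn, table)
    problems = []
    for column in sorted(set(dictionary) | set(actual)):
        if column not in actual:
            problems.append(f"{column}: in dictionary, not in database")
        elif column not in dictionary:
            problems.append(f"{column}: in database, not in dictionary")
        elif dictionary[column] != actual[column]:
            problems.append(
                f"{column}: dictionary says {dictionary[column]}, "
                f"database says {actual[column]}"
            )
    return problems
```

When one interface controls both the definitions and the
stored data, such a check is unnecessary because the two
cannot drift apart.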

Users who define tables and other objects (the case of
System R) are encouraged to include English text to describe
the meanings of the objects. Later, other users can retrieve
tables with certain attributes or can browse among
the descriptions of defined tables, if they are so
authorized. A user can later modify these entries to change
the attributes of an object.

B. MANAGEMENT OF INFORMATION RESOURCES

Information resource management (IRM) is a methodology
that attempts to solve a set of problems related to the
system life cycle in an integrated and coordinated manner.
The data dictionary system will play an important role in
this area.

In the case of SPLICE the DDS can play an important role
in providing a documented inventory of information
resources, a control mechanism for the analysis and design
of new information resources, and the necessary resource
independence.

A data dictionary can be used as a powerful tool (not as
a solution) that can aid in the solution to various problems
such as inventory control, report production, proper
routing of data, proper routing of requests, data
consistency, security, etc.

Finally, the dictionary system project is in fact an
Information Resource Management (IRM) project. The SPLICE
system possesses much valuable data that has been generated,
collected, and stored in an automatic and 'formatted' state.
Utilization of any class of data involves one or more
processes. These are [Ref. 6]:

- Collection: It is a process that tends to be
expensive, as the cost of identification and recording
(including input to an automated system, as necessary) can be
high.

- Processing: The data collected is generally 'managed'
in some fashion before and/or after being stored. In the
case of automated data, this occurs through the use of
computer programs.

- Storage: The repository of data and information,
termed a "data base".

- Retrieval: Using the knowledge about the storage
technique being used, data are retrieved to answer questions
or to be modified.

- Communications: A communication line is needed to
connect the user terminal with the place where the
dictionary resides.


Information Resource Management is whatever policy,
action, or procedure concerning information (both automated
and non-automated) which management establishes that serves
the overall current and future needs of the system. Such
policies, etc. would include considerations of availability,
timeliness, accuracy, integrity, privacy, security,
auditability, ownership, use, and cost effectiveness [Ref. 6].


The environment in which the above processes take place
is composed of:

- Data and information. Represents the core of the
entire information processing spectrum.

- The users in the system. These are the personnel involved
with the system, users of data and other information
components.

- Physical facilities. Computer hardware and other
physical devices used in data processing.

- Processing facilities. These are all the activities
which take place in the use of physical facilities.

- Support facilities. All the services which are
required by users of data as well as personnel whose
responsibilities are primarily in the information systems
area.

Each of the above components is referred to as an
Information Resource. The computer system must provide
an integrated and coordinated way to manage the
entire information resource of the SPLICE system, and the
data dictionary has to play a major role in conjunction with
the database management module.

C. SUPPORT OF SYSTEM LIFE CYCLE

In this section, we present some highlights of how the
data dictionary supports the main steps of system
development.

The waterfall model of the software life cycle [Ref. 14]
consists of the following stages: system feasibility,
requirements specification, product design, detail design,
coding, integration, implementation, operations and
maintenance. Of course there are also other models of a
software life cycle, but basically the functions of a DDS are
the same in whatever model we consider.


During the system's feasibility stage the DDS can be
used both for data element collection and to avoid
redundancies and inconsistencies. Also the DDS can contain
descriptions of processes that are already available and so
help in assessing the true magnitude of the proposed task.
During the requirements specification stage, the data
dictionary can provide the means to detect existing
inaccuracies in definitions and to correct them before the
system operation. This is because the DDS contains the
overall scope of the requirements to be specified.

During the product design and detail design stages, the
DDS can help because it contains the design details of both
data and processes, which can be shared by all members of
the design team. Particularly in database design the DDS
can record multiple user views, pass output from the logical
design phase to the physical design phase, generate multiple
designs for benchmark testing, and verify the existing
conversions of data in the system. For the rest of the
stages the DDS can help in data collection, coding, and
testing, by providing any desired degree of coordination and
control over tasks, generating data structures, storing
instructions for the staff, describing the various jobs and
activities, and finally, providing a means for effective and
consistent modification of the system.

Additional benefits that can be derived from the DDS
[Ref. 6] are naming standards, aid to auditing, interfaces
to application program development tools, and software
configuration management. A DDS allows a system to be
extended through the addition of new entity types,
relationship types, and attribute types, and can also be used
to add configuration entity types such as requirements
specifications, change notices, etc. The major advantage from
the use of the DDS is in the case of an active system where
the system not only records the entities, but also controls
how they are revised.


D. DATA DICTIONARY SYSTEM ORGANIZATION

The organizational structure for a DDS that is to be
adopted must be commensurate with the size of the activity
at any one time. Such a structure is displayed in Figure
2.1.

Figure 2.1 SPLICE Data Admin. Function Organization.

The Data Administrator is the person responsible for
articulating the data policy after the major guidelines have
been laid down by the designing team. That policy includes
planning for data collection, its structuring, its storage,
and its quality control. For the SPLICE system the Data
Administrator can be a person or a team located in any
place, whose main function will be the setting of the above
policy.


The Dictionary Administrator is the person or team
responsible for the dictionary system within the Data
Administrator function (e.g. recording of all
meta-information and meta-data and its maintenance through
the use of the dictionary system, along with making its
facilities available to the users of this system). Because in
the SPLICE system the data dictionary is unique throughout
the system and no different views of the data dictionary are
permitted in the various locations, that team or person must
be unique throughout the system. Only that team (or person)
must have the privilege to maintain the DD.

The Database Administrator is the person (or team)
responsible for the technical aspects of obtaining, running
and maintaining the DBMS. Since SPLICE is a distributed
system with databases distributed across different
locations, the Database Administrator does not need to be
unique. The required policy and definitions are set up by the
data dictionary administrator, and this is enough to maintain
consistency throughout the whole system.

The Data Quality Inspection team also has a role in the
hierarchy, and its function is the quality inspection of the
information or data, and the quality audit trail of the whole
system. This can be one or more teams. In the case of several
teams the entire audit effort can be divided among them.


E. CONCEPTS ON DDS SELECTION AND EVALUATION

It is very difficult to find a commercially available DDS
that meets exactly the requirements of a system under
development. A selection and evaluation process composed of
various steps must be developed in order to select the best
system.

Four steps are proposed by [Ref. 6] for the process of
selection and evaluation of a DDS:

- Determine the requirements for the dictionary system.
These should be classified as either being mandatory or not.
If not mandatory, establish a scale and assign numbers
indicating the importance.

- Develop a list of features of dictionary systems that
will be used in the evaluation of systems.

- Determine a mapping from the needs onto these features.

- For each mapping, using descriptions of available
systems, a system can be found either to qualify or not.
This process leads to the elimination of systems that do not
qualify.
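
The four steps above can be sketched as a simple filter and
score routine. This is one possible reading, not the method
of [Ref. 6]: the `Requirement` class, the `evaluate` function,
the additive weighting, and the rule that a need is met when
every feature mapped to it is offered are all assumptions made
for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Requirement:
    name: str
    features: Set[str]     # features that satisfy this need (step 3 mapping)
    mandatory: bool
    weight: int = 0        # importance, used only when not mandatory


def evaluate(requirements: List[Requirement],
             systems: Dict[str, Set[str]]) -> Dict[str, int]:
    """Drop systems that miss a mandatory need; score the rest."""
    scores: Dict[str, int] = {}
    for system, offered in systems.items():
        met = {r.name for r in requirements if r.features <= offered}
        if any(r.mandatory and r.name not in met for r in requirements):
            continue  # step 4: the system does not qualify
        scores[system] = sum(
            r.weight for r in requirements
            if not r.mandatory and r.name in met
        )
    return scores
```

The weights carry the subjective part of the evaluation, so
the ranking is only as reliable as the assessment of needs
that produced them.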

We cannot say that the above procedure is perfect and
carries no risk of mistakes, because it is subjective
and depends on the experience and ability of the
selection/evaluation team. Some of the more common reasons
leading to mistakes are: the needs were never properly
assessed, and potential users were not asked the right
questions; unnecessary but apparently "nice" features were
given high values; the evaluation of the systems was
inconsistent because different people evaluated different
systems without a well-defined measurement method; undue
emphasis was placed on features that will be needed in the
future but are unimportant now; etc.

For the SPLICE system we cannot follow the above
procedure. SPLICE has decided to use Tandem as its "front end"
minicomputer. As a result, selecting a DDS is largely a
foregone conclusion in this situation. So we have to use the
Tandem DBMS and the associated dictionary capabilities.

F. ADDITIONAL ASPECTS OF DDS

In the next few years, several extensions to dictionary
systems, not available today, will most likely be
commercially available. These additions will allow
dictionaries to be more effective in interfacing with the
information resources. The use of extensibility facilities
allows an installation to customize the dictionary system in
order to make it effective in such applications. Such
examples are the use of the DDS to control the total
information resource, to aid in the analysis, design and
development of information systems, and to aid in efficient
database design. The last application example is the use of
the DDS as a repository of information for an entire system.
This is exactly the major role the DDS has to play in the
SPLICE system.

Referring to the SPLICE application environment, the DDS
would require users and analysts to define the system data
elements, files, etc., which would entail updating old
definitions, discarding outdated ones, and introducing new
ones. In this way standards of data definition and
description for application programs can be established over
the entire SPLICE system [Ref. 1]. On the other hand, it is a
herculean task to retrofit a dictionary to existing
application systems. Because of the difficulties in
implementing the dictionary for old application systems, we
recommend as much more preferable to implement a
dictionary for new applications only. That means that the
dictionary will be developed gradually and a long period will
be needed for it to be fully implemented for the whole SPLICE
system.

Although DDSs have many advantages, their disadvantages
should be mentioned as well. Dictionary systems are complex
software systems, and the execution of many dictionary
functions may consume a significant part of the system
resources. As the scope of the dictionary is enlarged to
include an ever larger number of information resources, the
DDS will gradually begin to look like the major resource
consumer, and thus the main user of the host computer system
[Ref. 6]. When we consider active interfaces of the DDS,
the previous problem becomes more serious. If the DDS
controls a process through one of these active interfaces,
it follows that this process cannot proceed until such time
as the dictionary system has finished its job. This delay
time is added to the whole process time. Given that there
can be many processes, the continuous use of the DDS and the
accumulated service time may eventually result in a
bottleneck.

The proposed solution for the SPLICE system [Ref. 4] can
avoid (or at least reduce) this overhead by locating one
copy of the DDS in each LAN. With this simple and efficient
technique each user located in any of the 62 stock and
inventory control points only needs to consult the local
DDS. The number of users who need the DDS services remains
the same, but the overhead from the long queuing time across
the DDN will be reduced by a factor close to 62. By
locating the master copy of the DDS in one place we can
solve the maintenance problem of the DDS, because additions,
deletions and updates of the DDS can be done only via the
master copy by the Dictionary Administrator. All the other
copies can be updated only remotely by the master copy
through the DDN, in such a way as to represent the exact
image of the master copy. Because changes in definitions
(deletions, updates, additions) are not frequent, we
estimate that the whole process of updating the local copies
of the DDS will not be expensive, and the resultant overhead
will not be significant. Of course this assumes all 62
LANs are working off the same schema, and the application
environment is homogeneous across the network.
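
The master-copy scheme can be sketched as follows. The class
names `MasterCopy` and `LocalCopy`, the full-image push on
every change, and the `propagations` counter are assumptions
made for illustration; the text does not specify how updates
are actually carried over the DDN.

```python
from typing import Dict, List


class LocalCopy:
    """Read-only dictionary copy held inside one LAN."""

    def __init__(self, lan: str) -> None:
        self.lan = lan
        self._definitions: Dict[str, str] = {}

    def lookup(self, name: str) -> str:
        # Served locally: no DDN traffic is needed for a lookup.
        return self._definitions[name]

    def _install(self, image: Dict[str, str]) -> None:
        # Called only by the master when it propagates a change.
        self._definitions = dict(image)


class MasterCopy:
    """Single point of maintenance for the dictionary."""

    def __init__(self) -> None:
        self._definitions: Dict[str, str] = {}
        self._copies: List[LocalCopy] = []
        self.propagations = 0   # counts transfers sent over the DDN

    def attach(self, copy: LocalCopy) -> None:
        self._copies.append(copy)
        self._push(copy)

    def define(self, name: str, definition: str) -> None:
        self._definitions[name] = definition
        self._push_all()

    def delete(self, name: str) -> None:
        del self._definitions[name]
        self._push_all()

    def _push(self, copy: LocalCopy) -> None:
        copy._install(self._definitions)
        self.propagations += 1

    def _push_all(self) -> None:
        for copy in self._copies:
            self._push(copy)
```

Each change costs one transfer per attached LAN, which stays
cheap only while definition changes remain infrequent, as the
text assumes.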

G.  FJEBIBCEY  OF  DDS 

A good hierarchical DDS structure is significant if we want to avoid the "bottleneck" mentioned above. A structure is proposed in Figure 2.2 and we believe that it is less expensive in consuming the system resources than the structure of having different views of the master dictionary at each LAN.

Figure 2.2 A First DDS Hierarchical Structure for SPLICE.

In particular, suppose the copies of the local dictionaries are not exact images of the master dictionary, but are different views of the master, especially views containing information only for the local database. In such a case it is not useful to separate the definitions from the actual database since the different views of the whole database are centralized within each LAN. If a spare part, for example, cannot be found in a local database, then the user has to consult the master dictionary to find the location of the requested spare part, because the local copy of the data dictionary does not contain information about other databases of the system. In this case the user has to access the DDN twice, first to consult the master dictionary and then to consult the database in which the spare part is located. This procedure can easily lead to long waiting times and finally to a "bottleneck" because the master dictionary will have to answer questions coming from 62 different LAN's. A second hierarchical structure is shown in Figure 2.3. This structure involves the location of a copy of the DDS in selected nodes instead of each node. In this way we reduce the amount of secondary memory needed to store the D/D but we increase the use of the DDN. This increase in use of the DDN is inversely proportional to the number of replicated D/D copies. The solution of locating exact copies of the master dictionary in each or selected LAN's has the disadvantage of consuming more secondary storage, but our estimation is that this is preferable and less expensive than the frequent use of the DDN in order to consult the master copy.
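
The difference between the two arrangements can be made concrete by counting DDN accesses for a single lookup. The following is a minimal sketch only, not part of the SPLICE design: the names `Lan`, `ddn_accesses`, the part numbers and the LAN names are all hypothetical, and a "dictionary" is reduced to a mapping from part number to the LAN that holds it.

```python
from dataclasses import dataclass, field


@dataclass
class Lan:
    """One LAN: its local database plus whatever dictionary copy it holds."""
    name: str
    parts: set                                       # part numbers stored locally
    dictionary: dict = field(default_factory=dict)   # part number -> LAN name


def ddn_accesses(lan, part, master):
    """Count the DDN accesses needed to reach `part` starting from `lan`."""
    if part in lan.parts:
        return 0                        # served by the local database
    accesses = 0
    if part in lan.dictionary:
        location = lan.dictionary[part]     # local dictionary copy knows the location
    else:
        location = master[part]             # must ask the master dictionary
        accesses += 1
    if location != lan.name:
        accesses += 1                       # consult the database that holds the part
    return accesses


# Hypothetical data: the master dictionary knows where every part is kept.
master = {"P-100": "LAN-A", "P-200": "LAN-B"}
view_only = Lan("LAN-A", {"P-100"}, {"P-100": "LAN-A"})   # local view of the master
replica = Lan("LAN-A", {"P-100"}, dict(master))           # exact image of the master
```

Under these assumptions a LAN holding only a local view needs two DDN accesses to reach a part stored elsewhere, while a LAN holding an exact image needs one, which is the trade of secondary storage against DDN traffic described above.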

We cannot say that distribution instead of replication of the DDS is an inefficient method not acceptable for SPLICE. Since there is not enough experience with distributed systems, and especially with data dictionaries, we have to examine carefully every possible architecture, and the pros and cons of each one, in order to make the best decision. But still we believe that the decision will be based more on estimations coming from intuition and less on experience and statistical information. One such architecture, based on distribution instead of replication of the D/D for SPLICE, is shown in Figure 2.4 and will be examined in a later chapter.

Figure 2.4 A Third DDS Hierarchical Structure for SPLICE.


G. FEATURE ANALYSIS OF DDS


In this section the features of DDS and a more detailed analysis of them will be presented. This presentation is a theoretical approach and does not concern any particular system. A cost/benefit analysis can tell us which features need to be included in a DDS under development. It is a more preferable approach than to develop a DDS as described below using the Tandem DBMS capability.

1. Architecture and Implementation

The relationship between DDS and DBMS will be addressed here. The purpose of a DBMS is to manage data and the purpose of DDS is to manage meta-data.2 The question is whether the DDS must be a free-standing3 or DBMS-dependent4 system [Ref. 6].

The free-standing approach is good for commercial systems because each enterprise can evaluate the pros and cons and reach the optimal decision whether to buy or not. This approach raises compatibility problems between the DDS and the DBMS, especially when the vendors are different companies. There are many factors we have to take into account when deciding whether a DDS must be free-standing or DBMS-dependent. These factors include the method of implementation, the scope of usage, whether the DDS and DBMS are going to be developed together or not, and whether they are going to be supplied by the same vendor or not.

One other feature of DDS architectural structure is whether the DDS should be passive or active. Suppose there is a compiler, application program, or other process that requires meta-data for its execution. There should be a DDS available which automatically produces the required meta-data. This functionality is referred to as the dictionary interface and can operate in two modes:

Passive, where there exists an option of whether the process will retrieve the required meta-data (through the dictionary interface or from elsewhere) or, in the case where the process already contains the meta-data, there exists an option for the system to check whether this meta-data is the most current version in the dictionary. Here the dictionary is not in the critical path of a process.

Active, where the above options do not exist and the process always uses the most current meta-data in the dictionary. The dictionary here is in the critical path of the process and the process must go through the dictionary for the meta-data in order to execute properly.

2 Meta-data is the data that describes data.

3 A dictionary system which does not use a DBMS in its implementation.

4 A dictionary system which does use a DBMS in its implementation.

A DDS can contain both kinds of interfaces. We have to keep in mind that the interfaces of the DDS do not only concern the DDS itself, but also other modules with which the dictionary has to cooperate in order to maintain the whole system.
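
The two modes of the dictionary interface can be illustrated with a small sketch. It is only an illustration under stated assumptions: the class `Dictionary`, the functions `run_passive` and `run_active`, the version counter, and the element `PART-NO` are all hypothetical and do not come from any particular DDS product.

```python
class Dictionary:
    """Toy dictionary: each name maps to (version, definition)."""

    def __init__(self):
        self._entries = {}

    def define(self, name, definition):
        version = self._entries.get(name, (0, None))[0] + 1
        self._entries[name] = (version, definition)

    def fetch(self, name):
        return self._entries[name]


def run_passive(dictionary, name, embedded, check_current=False):
    """Passive mode: the process may use meta-data it already contains.

    The currency check is optional, so the dictionary is not on the
    critical path unless the caller asks for the check.
    """
    version, definition = embedded
    if check_current and dictionary.fetch(name)[0] != version:
        raise RuntimeError(f"{name}: embedded meta-data is not current")
    return definition


def run_active(dictionary, name):
    """Active mode: the process always goes through the dictionary."""
    return dictionary.fetch(name)[1]


dd = Dictionary()
dd.define("PART-NO", "PIC X(9)")
embedded = dd.fetch("PART-NO")       # meta-data captured inside a program
dd.define("PART-NO", "PIC X(12)")    # the definition changes later
```

With this data the passive process continues to run with the old definition unless it asks for the check, whereas the active process always sees the new one.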

2. Logical Schema, Entity Types, Relationships


Dictionary schema is the term denoting the logical structure of a dictionary. Structural characteristics and contents of the dictionary schema determine the kinds of meta-data and the relationships to be established among them. Using the entity-relationship-attribute model [Ref. 6] for the dictionary, we define entities as real world objects or things about which information exists in the dictionary, attributes as properties (quantities or qualities) of the entities, and relationships as connections between entities.

In the DDS, resources such as data, hardware, software, transactions, personnel and documents may be represented, and entities, attributes, and relationships associated with these resources must also be represented. Tables I through V at the end of this chapter, taken from [Ref. 4], indicate possible data element attributes, file entity attributes, hardware entities and attributes, software entities and attributes, and document/report attributes for the SPLICE system.

Similar entities in a DDS establish entity types. Attributes can also have a degree of similarity and in that case we speak about attribute types. Finally, similar considerations apply to relationships and so we have relationship types, which are relationships between entity types.

Schema descriptor: In a dictionary schema containing all existing entity-types, relationship-types, and attribute-types, any one of them can be referred to as a schema descriptor. Information existing in the schema can indicate which entity-types are members of a given relationship-type, and which attribute-types are associated with an entity-type or relationship-type.
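
A dictionary schema of this kind can be sketched as a small collection of descriptors. The sketch below is an assumption made for illustration: the class names, the `Schema` methods, and the sample descriptors (`ELEMENT`, `FILE`, `CONTAINS`) are hypothetical and are not taken from [Ref. 6].

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AttributeType:
    name: str


@dataclass
class EntityType:
    name: str
    attributes: list = field(default_factory=list)


@dataclass
class RelationshipType:
    name: str
    members: tuple                  # names of the participating entity types
    attributes: list = field(default_factory=list)


class Schema:
    """Holds every schema descriptor under its name."""

    def __init__(self):
        self.descriptors = {}

    def add(self, descriptor):
        self.descriptors[descriptor.name] = descriptor
        return descriptor

    def members_of(self, relationship_name):
        """Which entity types are members of a given relationship type."""
        return self.descriptors[relationship_name].members

    def attributes_of(self, name):
        """Which attribute types belong to an entity or relationship type."""
        return [a.name for a in self.descriptors[name].attributes]


schema = Schema()
length = schema.add(AttributeType("LENGTH"))
location = schema.add(AttributeType("LOCATION"))
schema.add(EntityType("ELEMENT", [length]))
schema.add(EntityType("FILE", [location]))
schema.add(RelationshipType("CONTAINS", ("FILE", "ELEMENT")))
```

Every object registered with `add` is a schema descriptor in the sense used above, and the two query methods answer the two questions the schema is said to support.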

Entity-types of a DDS can be classified as data entity-types, process entity-types and usage entity-types. On the other hand, attribute types can be descriptions, classification and audit attributes created by the dictionary to indicate identification of the person who created the entity, date of entity creation, identification of the person who last modified the entity, date of latest modification, and total number of modifications of the entity [Ref. 6]. These capabilities are very useful for a system, especially one as complex as SPLICE. Using the above capabilities, reports and summaries can be presented on request, and we can also have a trace of various interactions on the system using application programs for this reason.
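
The audit attributes listed above can be sketched as a record that the dictionary itself maintains. The field names, the `new` and `touch` methods, the user names and the dates are hypothetical and serve only to illustrate the five items of information.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class AuditAttributes:
    """Audit attributes kept by the dictionary for one entity."""
    created_by: str
    created_on: date
    modified_by: str
    modified_on: date
    modifications: int = 0

    @classmethod
    def new(cls, user, today):
        # At creation the creator is also the last modifier.
        return cls(user, today, user, today)

    def touch(self, user, today):
        """Record one modification of the entity."""
        self.modified_by = user
        self.modified_on = today
        self.modifications += 1


# Hypothetical users and dates.
audit = AuditAttributes.new("dda", date(1984, 6, 1))
audit.touch("clerk", date(1984, 6, 15))
```

A report or trace program would then only need to read these fields for each entity.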

3.  Interfaces  and  Commands 


[...] carried out through the Session Services module. This is a separate topic which will be examined separately. In general an interface can be as shown in Table VI.

On the other hand, commands can be classified, on the basis of their functionality, into various categories as shown in Table VII.

A dictionary system can be regarded as a software product that helps in storing information about data that already exists in databases. Both DDS and DBMS deal with descriptions and characteristics of data elements and with the logical structures obtained from these elements and their relationships. A closely integrated dictionary system and automated database design process have much to offer. The interfaces between a dictionary and a database design process can be divided into two broad categories:

-Initial data entry and editing
-Logical model structuring

Initial data entry and editing: For data entry, the data requirements information needed by automated database design procedures is almost a complete (proper) subset of the information normally stored in current commercial dictionary systems. For SPLICE the files already exist but the dictionary does not. Therefore the whole design of the DDS must provide for initial detection and avoidance of duplicate entries. As soon as the design takes care of that during the initial steps, the entry of information about raw data elements has to be made only to the dictionary system. Next, an interface must exist in order to allow the design procedures to access information in named aggregations (local views). For editing, the initial data entry is rarely clean in the sense that names, usage, and characteristics of the data elements may not yet be standardized across local views. Synonyms, homonyms and inconsistent characteristics of the same data usually result when data requirements are gathered from different sources. The editing phases of the automated design procedures, and the reports produced therein, can serve as an input filtering function for the dictionary. When the interactive editing phases are completed, obsolete information (e.g. non-standard names) can be removed from the dictionary, such that the information remaining permanently is clean and consistent. Again, as we mentioned in a previous section, this can be done only for new applications because the task of retrofitting a dictionary to existing application systems is very difficult.
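
The input filtering role of the editing phase can be sketched as two simple checks over the local views. The definitions used here are working assumptions for the sketch only: a homonym is one name that appears with different characteristics in different views, and a synonym candidate is a group of different names that share identical characteristics. The function names, view names and data elements are hypothetical.

```python
from collections import defaultdict


def find_homonyms(views):
    """Names that carry different characteristics in different local views."""
    seen = defaultdict(set)
    for view in views.values():
        for name, traits in view.items():
            seen[name].add(traits)
    return sorted(name for name, traits in seen.items() if len(traits) > 1)


def find_synonym_candidates(views):
    """Groups of different names with identical characteristics."""
    names = defaultdict(set)
    for view in views.values():
        for name, traits in view.items():
            names[traits].add(name)
    return sorted(sorted(group) for group in names.values() if len(group) > 1)


# Hypothetical local views: element name -> (type, length).
views = {
    "supply": {"PART-NO": ("char", 9), "QTY": ("num", 5)},
    "repair": {"STOCK-NO": ("char", 9), "QTY": ("num", 7)},
}
```

Identical characteristics do not prove that two names denote the same data, so the second function only proposes candidates for the interactive editing phase to confirm or reject.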

Logical model structuring: The structuring procedure for initial design should be able to extract filtered, unstructured data element information in named aggregates (local views) from the dictionary such that the composite model and the derived logical designs can be generated in the normal manner.

For adding new requirements to existing designs and when processing new functions or adding new data to an existing database, the design process should be able to extract from the dictionary a description of the existing design along with the filtered unstructured data element information for that which is new. Various levels of constraints on the freedom of structuring processes can be set here in order to facilitate the whole design effort.

Once the automated design process is completed and a suitable logical design has been obtained, the results must be stored in the dictionary. Assuming the unstructured data elements are already described in the dictionary, the relationships defining segments, databases, logical relations and secondary indexes would now be stored.


TABLE I

Data Element Attributes

Type
Range
Length
Unit of measure
Usage
Language names
Repetitions
88 Levels
Key
Default value
Display format

TABLE II

File Entity Attributes

File name
Locations
Size (in bytes)
Format (seq, random, tir)
Access control
Access security protection

TABLE III

Selected Hardware Entities and Attributes

Entities

    Processing system
    Secondary storage
    Communications system
    Concentrators
    Terminals
    LAN I/O peripherals

Attributes

    Type
    Model
    Model number
    Serial number
    Maker's number
    Source
    Features
    Description
    Docu-references
    Usage by site
    Cost
    Maintenance activity


TABLE IV

Selected Software Entities and Attributes

Entities

    Operating system
    Operational support system
    Environmental system
    Application software

Attributes

    Program-id
    Revision number
    Revision date
    Date compiled
    Type of compiler
    Patch level
    Change level
    License
    Date released
    Product number
    Source
    Features
    Documentation
    Usage
    Cost
    Maintenance activity


TABLE V

Document/Report Attributes

Name
Number
Product number
Release date
Revision number
Source
Feature
Description
Quantity
Cost


TABLE VI

Kinds of DDS Interfaces

Command language
Screen oriented interface
Fixed format batch data entry facility
Programmatic interface that allows user written application programs to access the dictionary


TABLE VII

Command Categories for DDS

Dictionary maintenance
Report and query
Data structure interface
Extensibility
Status related
Security
Dictionary processing control
Dictionary administrator


III. INTEGRATION


A. THE PROBLEM

An active5 data dictionary is desirable for the SPLICE system. It is also known [Ref. 3] that most dictionaries fail to meet this objective. A prerequisite to an active dictionary is a high degree of interaction between the dictionary and various other software elements such as the DBMS itself, but also including query languages, report generators, application development aids, and the like. An architecture for a centered and highly integrated DDS taken from [Ref. 8] is shown in Figure 3.1.

The existing dictionaries today are noticeably unintegrated, and hence less than active. Such a situation is shown in Figure 3.2 (taken from [Ref. 8]) concerning the IBM DB/DC data dictionary and related software. Notice, in particular, that whereas some batch feeding of data is provided to and/or from the dictionary, there are no fewer than six places where database definition data is stored (in addition to data definitions included in actual programs) [Ref. 8]. These are:

The DB/DC dictionary itself
The DBD/PSB libraries
The COBOL copy library
The database design aid (DBDA)
The GIS data definition tables
The application development facility (ADF) segment rules in an IMS/DC environment, or development management system (DMS) files in a CICS environment.

5 Active to some degree, because if it is too active we can lose efficiency.


There is no guarantee that each of these descriptions will agree at any point in time. Other data dictionaries may have a higher degree of integration but no one is close [...]

[...] is in the database, or a DML routine which wants to edit a field prior to updating the database, or the database access system which needs to know if a user password is valid for updating a certain record. All the above functions require direct access to the data dictionary.

The extent to which a DDS qualifies as being "integrated" is a relative notion determined by the scope of its metadata and the way that it interfaces with other software. The most common use of the term "integrated" is with reference to a D/D that is the sole source of metadata in the system. The integrated D/D is accessed for all references to metadata. Most of the commercially available DDS have not reached a high degree of integration with their environments, and this results in multiple sources of descriptors within the systems. The DDS permits these systems to access the D/D indirectly and converts the metadata of each system to the format required by the D/D [Ref. 5]. So, for example, a DDS might communicate with a compiler in either of two ways:

-By generating file and record definitions that the compiler accepts via copy statements.

-By reading source programs and creating transactions to load the DDS with descriptions of files, records, and elements.

One additional area which demands investigation for the development of a successful DDS concerns integrating schemas which describe the logical structures of all data types existing in a distributed database (like SPLICE). This feature permits the determination of a data file's logical structure as well as its identity and location, and could possibly be essential to the development of query and data model translation schemes. The existence of a master schema also permits the logical relation of data across file boundaries; then all files in the network can be considered as areas within a single large database [Ref. 9].

Figure 3.2 IBM Data Management Architecture.

B. INTEGRATION OF DDS

Three aspects of integrated DDS in the centralized and distributed database environments for SPLICE6 are of great interest and must be emphasized [Ref. 5].

-The software interfaces
-The convert functions
-The environmental dependency between the DDS and the DBMS

A DDS is integrated with other software packages by facilities that:

-Allow direct and indirect access to the D/D
-Automatically capture the metadata used by other systems

In the next three subsections we will examine the three most interesting aspects of an integrated DDS.

1. Software Interfaces

A software interface permits another system to access the D/D either statically or dynamically. First we consider the static interface, which links the D/D with another system indirectly via the extraction of a file of formatted metadata. For the static interface of a DDS and a DBMS, for example, the data dictionary administrator, following the specifications of the data administrator, enters into the DDS all pertinent transactions to define the database, and the database administrator, using the above definitions, describes the database. After reviewing the accuracy of this database description, a command is generated for the DDS that uses this description to produce a file containing the DDL. The DBMS's DDL processor then translates this generated DDL into a schema file that the run time unit of the DBMS can access. No run-time connection between the DDS and the DBMS exists here; the DBMS's processor is not executing during the DDS's DDL-generation process.

6 Our approach for the SPLICE database and data dictionary distribution is hybrid. SPLICE is a distributed system, but the databases are centralized within each LAN. Also the dictionary copies at each of the selected LAN's are exact copies of the master dictionary and different dictionary views are not permitted. So the whole SPLICE system can be viewed as a distributed system, but concerning each particular LAN, the database and data dictionary can be said to follow the centralized database environment concept. So both ideas of centralized and distributed environments can be applied to SPLICE with slight modifications.

Static interfaces differ somewhat, depending upon whether they interface the DDS with user-written programs or with vendor-supplied software packages. Static interfaces for programs written in languages such as COBOL and PL/I produce file, record, and database descriptions for the user programs from the data dictionary [Ref. 5]. These interfaces sometimes feature edit capabilities, format options, and various other functions to make the interface more flexible. Edit capabilities may include being able to add prefixes and suffixes and even to replace entire names. Format options may control indentation, level-number increments, sequence numbers, and line identifiers. Inclusion of various clauses such as comments, condition names, and initial values also may be allowed.
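
A static interface of this kind can be sketched as a generator that turns dictionary entries into a COBOL-style record description. This is a minimal sketch under stated assumptions: the function `generate_copybook`, its parameters, and the sample record are hypothetical, and only two of the options named above (a name prefix and a level-number increment) are shown.

```python
def generate_copybook(record, elements, prefix="", start_level=1, increment=4):
    """Return COBOL-style record lines built from dictionary element entries.

    `elements` is a list of (name, picture) pairs as they might be held
    in the dictionary; `prefix` is an edit option and `increment` is a
    format option controlling the level-number step.
    """
    lines = [f"{start_level:02d}  {prefix}{record}."]
    level = start_level + increment
    for name, picture in elements:
        lines.append(f"    {level:02d}  {prefix}{name}  PIC {picture}.")
    return lines


# Hypothetical dictionary content for one record.
elements = [("PART-NO", "X(9)"), ("QTY-ON-HAND", "9(5)")]
copybook = generate_copybook("STOCK-REC", elements, prefix="WS-")
```

The generated lines would be catalogued in a copy library and picked up by the compiler through copy statements, so the dictionary is not consulted while the program is compiled or run.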

Static interfaces for software packages, such as DDL processors, communication monitors, and query processors, produce formatted statements for those packages or create specially encoded control files for their use.

Static interfaces are prevalent because of their utility, capability, and efficiency. With powerful static interfaces, the data administrator can quickly change formatted metadata or create new formatted definitions from existing D/D entities. The static D/D can be made compatible with many versions of other software packages and can be developed independently of the source code of particular software packages. A disadvantage to the user of a static interface is the extra effort that may be needed to generate and catalog metadata for the D/D. Moreover, the static interface itself has no capabilities for updating the metadata of the systems with which it interfaces. Without adequate synchronization and controls, the metadata in the DDS and the metadata in other systems may become inconsistent [Ref. 5].

Dynamic interfaces provide direct access to the DDS by other software modules. This direct access is commonly achieved via high-level interface commands that shield the software package from the physical details of the D/D. The commands activate standard DDS functions, such as selecting all entity occurrences that satisfy a particular condition. A DDS can provide a facility that makes commands available through call statements; any program can then access the D/D without knowledge of its physical structure. Dynamic interfaces provide consistency control and capabilities for both update and retrieval. Changes to the D/D are automatically reflected in the next execution of any software packages to which the D/D is interfaced; no intervening procedures are required as with static interfaces. A software package can directly retrieve and update metadata stored in the D/D if the user has the authority to do so and the software package has such a capability. Otherwise the software package and the user would only have read authority to the D/D.
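
The idea of high-level commands reached through call statements can be sketched as follows. The class `DictionaryInterface`, its two commands, the user names and the sample entities are hypothetical; the sketch only illustrates that the caller never sees the physical structure and that update requires authority.

```python
class DictionaryInterface:
    """Toy dynamic interface: callers use commands, never the storage."""

    def __init__(self, entities, updaters=()):
        self._entities = entities          # entity name -> dict of attributes
        self._updaters = set(updaters)     # users holding update authority

    def select(self, condition):
        """Select all entity occurrences that satisfy `condition`."""
        return sorted(
            name for name, attrs in self._entities.items() if condition(attrs)
        )

    def update(self, user, name, **changes):
        """Update metadata, but only for users with update authority."""
        if user not in self._updaters:
            raise PermissionError(f"{user} has read authority only")
        self._entities[name].update(changes)


dd_interface = DictionaryInterface(
    {
        "PART-NO": {"type": "char", "length": 9},
        "QTY": {"type": "num", "length": 5},
    },
    updaters={"dda"},
)
```

Because every caller reads through `select` at execution time, a change made through `update` is visible to the next call without any regeneration step.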

Here is where special attention must be given when designing a DDS for SPLICE. We said previously, when we described the first and the second hierarchical structures for SPLICE, that the local copies of the SPLICE DDS will be exact images of the master copy. With this approach one can imagine what will happen if one program in any of the 62 LAN's attempts to update the metadata stored in the DDS. The whole consistency of the system is gone. The local copies will no longer be exact images of the master copy and many problems can arise. The only solution for the proposed architecture for the SPLICE DDS is that requests for update, deletion, or addition of data definitions must be routed via the DDN to the node where the master copy of the DDS resides. Then the data dictionary administrator, who is the only person responsible for DDS maintenance, can approve and make the requested changes in the master copy. These changes must then be transmitted to the various locations where copies of the D/D reside and executed. This, we believe, is the only procedure under the proposed DDS architecture which can maintain consistency over the whole SPLICE system. We cannot say that this kind of operation is purely dynamic, but neither is it static. We might call it a hybrid interface function wherein the security and validity checks of the DDS are always applied.
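
The hybrid procedure can be sketched as a request queue at the master node followed by propagation to the replicas. Everything named here is an assumption made for the sketch: the class `MasterDictionary`, the `validate` callback standing in for the administrator's approval and the DDS validity checks, and the plain dictionaries standing in for the replicated copies. DDN transport is not modelled.

```python
class MasterDictionary:
    """Master copy of the D/D with its replicated copies."""

    def __init__(self, replicas):
        self.definitions = {}
        self.replicas = replicas      # one dict per LAN that holds a copy
        self.pending = []

    def request(self, origin, name, definition):
        """A LAN asks for a change; nothing is changed yet."""
        self.pending.append((origin, name, definition))

    def approve_all(self, validate):
        """Administrator step: apply valid requests, then propagate them."""
        applied = []
        for origin, name, definition in self.pending:
            if validate(name, definition):
                self.definitions[name] = definition
                for replica in self.replicas:
                    replica[name] = definition    # keep exact images
                applied.append(name)
        self.pending = []
        return applied


replicas = [{}, {}]
master_copy = MasterDictionary(replicas)
master_copy.request("LAN-7", "PART-NO", "X(9)")
master_copy.request("LAN-9", "", "X(1)")          # invalid: empty name
applied = master_copy.approve_all(lambda name, definition: bool(name))
```

No LAN ever writes to its own copy, so after each approval step every replica is again an exact image of the master.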

The use of dynamic interfaces incurs significant overhead due to the size and complex structure of the DDS. Application development support aids, such as preprocessors, source program managers, and design aids, generally can afford this overhead because response time is not critical. On the other hand, efficiency is critical for transaction-processing systems that reference the D/D.

To reduce the potential overhead, common queries may be precompiled and stored in the D/D. Another technique used to reduce overhead is for the software package to retrieve all the metadata required for a transaction at once; thus future accesses for this transaction only involve memory lookup. Table VIII from [Ref. 5] shows some typical types of software package interfaces for DDS.
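
The second technique can be sketched as a transaction that reads all its metadata in one dictionary access and afterwards works from memory. The classes `CountingDictionary` and `Transaction` and the sample entries are hypothetical; the access counter exists only to make the saving visible.

```python
class CountingDictionary:
    """Toy D/D that counts how many times it is accessed."""

    def __init__(self, entries):
        self._entries = entries
        self.accesses = 0

    def fetch_many(self, names):
        self.accesses += 1
        return {name: self._entries[name] for name in names}


class Transaction:
    """Retrieves all required metadata at once, then uses memory lookups."""

    def __init__(self, dictionary, needed):
        self._cache = dictionary.fetch_many(needed)   # single D/D access

    def metadata(self, name):
        return self._cache[name]                      # memory lookup only


counted = CountingDictionary({"PART-NO": "X(9)", "QTY": "9(5)"})
txn = Transaction(counted, ["PART-NO", "QTY"])
first = txn.metadata("QTY")
second = txn.metadata("PART-NO")
```

The price of this technique is that a definition changed during the transaction is not seen until the next transaction starts.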

2. Convert Functions

In addition to software interfaces, the integration of a DDS into its environment is provided by convert functions. A DDS organization has a lot of programs, reports and files to manage. The data/data dictionary administrator must encode thousands of maintenance transactions to capture the metadata of all these applications. The convert functions of a DDS scan source programs, database descriptions, and teleprocessing environment descriptions and automatically produce maintenance transactions, thus sparing the data administrator many hours of manual effort. Figure 3.3 from [Ref. 5] illustrates the flow of data through a typical convert function.


Figure 3.3 System Flow for a Convert Function.

Inputs include the source language statements and the D/D; outputs are a file of transactions to be input to the D/D maintenance module (in the case of SPLICE that refers to the maintenance module of the master copy) and a report.

The D/D maintenance transactions include descriptions of databases, files, records, groups, elements and programs. The prime purpose of convert functions is to convert metadata both from user-written programs and from the local DBMS and its related components. Table IX illustrates in summary the typical D/D convert function transactions.

Four major characteristics [Ref. 5] of convert functions are:

The content of the generated transactions, where the D/D maintenance transactions created by a convert function usually also contain the relationships between data entities.

The input file to a convert function, which can be a source program or a library file.

The options, which may include the ability to change names, select lines to scan, select types of transactions to create, and override generation of some types of metadata. The ability to analyze the metadata of source programs can make the DDS a valuable tool for auditing adherence to software control techniques.
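
A convert function can be sketched as a scanner that reads record descriptions from a source program and emits maintenance transactions plus a report. The sketch rests on several assumptions: it recognizes only a very small COBOL-like subset (two-digit level, name, optional PIC clause, terminating period), and the transaction tuples, the `rename` option and the sample source are hypothetical rather than the format of any real D/D maintenance module.

```python
import re

# Two-digit level number, data name, optional PIC clause, final period.
LINE = re.compile(r"^\s*(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+?))?\.\s*$")


def convert(source, rename=None):
    """Scan source text; return (maintenance transactions, report lines)."""
    rename = rename or {}
    transactions, report = [], []
    parent = None
    for number, text in enumerate(source.splitlines(), start=1):
        match = LINE.match(text)
        if not match:
            if text.strip():
                report.append(f"line {number}: skipped")
            continue
        _level, name, picture = match.groups()
        name = rename.get(name, name)          # option: change names
        if picture is None:
            parent = name                      # a group item: treat as record
            transactions.append(("ADD", "RECORD", name))
        else:
            transactions.append(("ADD", "ELEMENT", name, picture))
            if parent:
                # Relationships between entities are generated as well.
                transactions.append(("RELATE", parent, "CONTAINS", name))
    return transactions, report


source = """\
01  STOCK-REC.
    05  PART-NO  PIC X(9).
    05  QTY      PIC 9(5).
* legacy comment
"""
transactions, report = convert(source, rename={"QTY": "QTY-ON-HAND"})
```

For SPLICE the resulting transaction file would be submitted to the maintenance module of the master copy, and the report would tell the administrator which source lines were not understood.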

3. Environmental Dependency

This characteristic of a DDS is determined by its reliance on a specific hardware configuration, an operating system, a DBMS, or a teleprocessing monitor. Under ideal conditions a DDS must have the capability to operate in such an environment without losing efficiency and functionality. But sometimes the practice deviates from theory.


In a completely integrated DDS the DBMS accesses stored databases via the D/D. In a less integrated system, the DBMS may maintain its own directory file for accessing stored databases.

In the independent approach the DDS is completely autonomous; it does not rely on any particular DBMS, and the DBMS maintains its own source of metadata.

In the DBMS application approach the D/D appears to the DBMS as just another database. The DBMS maintains its own metadata for each database and those metadata are separate from the D/D.

For the SPLICE system, it is proposed that the embedded approach be used, where the DDS is actually a component of the DBMS's. This approach provides complete [...] metadata. The DBMS utilities provide the D/D management facilities and the DBMS uses the D/D to directly access the stored databases. No other directories, internal or external, exist for the DBMS, and the DBMS and its facilities rely completely on the D/D for metadata. Such a structure is shown in Figure 3.4.

Figure 3.4 SPLICE Embedded Approach to DDS.


So, for example, a query processor extracts user views from the DDS and the DBMS applies integrity constraints specified in the DDS by the DDS administrator before storing a data element. A major difficulty here, which the SPLICE designers must overcome, is the fact that the DBMS for SPLICE already exists but the DDS does not. The embedded approach is easier and simpler when both DDS and DBMS are developed in parallel, but this is not the case for SPLICE. So special attention and effort must be applied during the DDS development phase.
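
The reliance of the DBMS on the D/D in the embedded approach can be sketched with the integrity-constraint example. The class `EmbeddedDBMS`, the use of a predicate function as a constraint, and the element `QTY` with its range are assumptions made for the sketch; they do not describe the existing SPLICE DBMS.

```python
class EmbeddedDBMS:
    """Toy DBMS whose only source of metadata is the embedded D/D."""

    def __init__(self, dictionary):
        self.dictionary = dictionary   # element name -> constraint predicate
        self.stored = {}

    def store(self, element, value):
        # An element not defined in the D/D cannot be stored (KeyError).
        constraint = self.dictionary[element]
        if not constraint(value):
            raise ValueError(f"{element}: value {value!r} violates its constraint")
        self.stored[element] = value


# Hypothetical constraint specified by the DDS administrator.
dbms = EmbeddedDBMS({"QTY": lambda v: isinstance(v, int) and 0 <= v <= 99999})
dbms.store("QTY", 12)
```

Since no other directory exists, a data element that the D/D does not describe is simply unknown to the DBMS, which is why retrofitting the D/D under an existing DBMS requires the special attention noted above.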


TABLE VIII

Types of Software Package Interfaces for a D/D System

Module                    Description

DDL processor             Creates a schema file

Database control system   Run-time unit of a DBMS

Preprocessor              Translates DML into CALL statements

Query/update processor    Provides direct end-user access to stored
                          databases

Batch-code generator      Reduces the time to develop a standard
                          function as compared to a compiler-level
                          language

Source-program manager    Provides security protection, data
                          compression and editing capabilities for
                          source programs

Teleprocessing monitor    Provides the capability of interactive
                          computing to remote terminals

Test-data generator       Creates test files and databases according
                          to user specifications

Design aid                Analyzes and generates designs of databases
                          or information systems

TABLE IX

Transactions for D/D Convert Function

Type                      Supported transactions

Programming               Element, group, record, file, and
                          sometimes subschema and process

Database description      Database, file, subschema,
                          relationship, record, group, element

Teleprocessing            Terminal, line, processor, transaction


IV. SESSION SERVICES AND DATA DICTIONARY

A. GENERAL

The term "session" is defined in [Ref. 4] as follows:

"Session: All the activity (message exchange and processing)
which takes place between two or more processes for the
duration of a single task (e.g. text editing or processing
of a transaction file)."

The session services module of SPLICE coordinates the activity
of the other functional modules (FM's) and provides them with
work instructions via the service codes it inserts in messages
to the FM's. The sequence of operations may be data dependent
or highly interactive, so in some cases neither the work
breakdown nor the full set of FM's involved can be determined
in advance by session services. In all cases, session services
passes control to the first (controlling) FM which is to
perform an operation; subsequent "calls" to other FM's, if
any, take place according to processing conditions. Session
services retains and maintains state information until either
a completion message or an error message has been received
from the controlling FM. In the case of a message which is
destined for an object located in another network, this fact
is indicated in the "message type" field. The physical
destination address would have been obtained previously from
the data dictionary, which exemplifies the relationship
between session services and the data dictionary.
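
As a rough illustration of the coordination just described, the following Python sketch shows session services looking up a destination, setting the message type, and retaining state until the controlling FM reports back. The class names, the field names, the "local"/"internetwork" values, and the sample service code "QRY" are assumptions made for this example; the SPLICE message format is not defined here.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    service_code: str   # work instruction for the receiving FM
    message_type: str   # "local" or "internetwork" (assumed values)
    destination: str    # physical address taken from the data dictionary
    body: dict = field(default_factory=dict)


class SessionServices:
    def __init__(self, dictionary, local_network):
        self.dictionary = dictionary      # object name -> (network, address)
        self.local_network = local_network
        self.sessions = {}                # session id -> retained state
        self._next_id = 1

    def start(self, service_code, obj, controlling_fm):
        """Look up the object, build the message, pass control to the first FM."""
        network, address = self.dictionary[obj]
        kind = "local" if network == self.local_network else "internetwork"
        session_id = self._next_id
        self._next_id += 1
        self.sessions[session_id] = {"fm": controlling_fm, "object": obj}
        return session_id, Message(service_code, kind, address)

    def report(self, session_id, outcome):
        """State is dropped only on a completion or error message."""
        if outcome in ("completion", "error"):
            del self.sessions[session_id]


dictionary = {"stock_file": ("LAN-1", "node-3"),
              "remote_file": ("LAN-7", "node-12")}
services = SessionServices(dictionary, local_network="LAN-1")
session_id, message = services.start("QRY", "remote_file", controlling_fm="DBMS")
```

Because the address lookup happens before the message is built, the data dictionary must be reachable by session services at session start, which is the dependency the paragraph above points out.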


Session services is used in a distributed environment and
involves the seven layer architecture model of the ISO for
distributed networks. The ISO seven layer architecture is a
standard one and involves the following layers with the
associated functions:

Layer          Function

Application    User process
Presentation   Format data the way the user wants it
Session        Sets up session between communicating processes
Transport      End to end control
Network        Switching, routing
Data link      Reliable transmission between two nodes
Physical       Physical transmission of bits between two nodes

The complexity of the SPLICE processing environment requires
that user terminal processes be given considerable assistance
in carrying out their tasks [Ref. 4]. Session services can
provide this assistance. User terminal processes specify task
environments, largely by task name and with the assistance of
the data dictionary, where necessary (Figure 1.1).

B. ARCHITECTURE INTERFACES

In the SPLICE layered architecture, the interfaces between the
layers are critically important. In particular, we are very
interested in the software interfaces between the modules
which communicate with the data dictionary. These modules are
the session services module and the DBMS module. Some forms of
software interfaces between DBMS and D/D can be found in the
current literature [Ref. 5]. On the other hand, no one has yet
defined the required software interfaces between the D/D and
session services modules. We believe that these software
interfaces must be of the same type as, and closely related
to, the interfaces between the end user and the session
services. In a centralized system where session services does
not exist, the end user has to interface directly with the
D/D, but in a distributed system the session services module
acts as the mediator between the end user and the data
dictionary. As a minimum then, the interfaces between session
services and the data dictionary in a distributed system must
include the interfaces between end user and data dictionary in
the centralized model.

The interfaces between the above modules must be designed to
accommodate new mechanisms and, as far as possible, new
functions as they arise. As new mechanisms and network
functions come into use in the system, it is highly desirable
that previously written programs continue to work. This is
achieved by designing the interfaces appropriately and
preserving them. In the seven layer architecture, layers 4, 5,
6 and 7 provide end-to-end communication between sessions in
user machines. Layers 1, 2 and 3 provide communication with
the nodes of the shared network.

Because the SPLICE system uses a modified ISO layered
approach, the interfaces between machines need to be defined
in terms of the layers. So we will have layer headers and
control messages that are passed between the layers. The
application programmer does not need to know anything about
these. For example, any command language using commands
similar to GET, PUT, OPEN, CLOSE and DELETE can refer to data
or facilities in a distant machine.
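
The idea that layer headers stay invisible to the application programmer can be sketched in Python as nested wrappers. This is only an illustration that uses the standard seven ISO layer names; the header contents, the dictionary layout, and the sample GET request are assumptions, and the modified SPLICE layering is not reproduced.

```python
LAYERS = ["application", "presentation", "session", "transport",
          "network", "data link", "physical"]


def send(payload):
    """Wrap the payload with one header per layer, top layer first."""
    frame = payload
    for layer in LAYERS:
        frame = {"header": layer, "data": frame}
    return frame


def receive(frame):
    """Strip headers from the physical layer upward and return the payload."""
    for layer in reversed(LAYERS):
        if frame["header"] != layer:
            raise ValueError(f"expected {layer} header")
        frame = frame["data"]
    return frame


# The application only ever sees the command, never the headers.
request = {"command": "GET", "object": "stock_record"}
delivered = receive(send(request))
```

The outermost wrapper produced by send is the physical layer header, and receive removes the wrappers in the opposite order, so the command that arrives is the one the application issued.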


C. THE SESSION SERVICES MODULE


There are differences in the session services provided
depending upon the type of network. In the distributed
environment different types of user software need different
types of session services. These differences involve not only
the software but also the architecture. So one set of session
services may be provided for one manufacturer's architecture
and a different set for another. This is very important for
SPLICE because the hardware used throughout the system varies.
It may be possible that services provided across the system
are of different types. However, it is desirable to have
common session services, because this will facilitate the
maintenance task. Also, for interfacing purposes we want
session services to present a common image to the system. This
can be accomplished by hiding the necessary interface units
from the session services. In [Ref. 10 pp h91] there is a
description of possible functions of the session services
subsystem in a distributed network. These functions are
generally divided into three large groups:

-Functions required when setting up or disconnecting a
session.

-Functions used during the normal running of a session.

-Functions employed when something goes wrong, such as a
node failure or a protocol violation.

More precisely these functions are divided into the
following categories:

--Assistance in establishing a session
--Basic networking functions
--Application macroinstructions
--Program control facilities
--File access functions
--Recovery and error control
--Editing and translation
--Dialogue software
--Virtual operations and transparency
--Compaction
--Payment functions
--Security and audit functions

D. INTERFACES

Functional interfaces between session services and the data
dictionary must permit other software modules to access the
D/D and convert metadata into the format required by the
DDS.

A DDS provides many functions and features such as:

Maintenance
Extensibility
Report processor
Query processor
Convert
Software interface
Exit facility

The software interface function must provide a formatted
pathway enabling the DDS to provide metadata to other software
systems such as compilers and DDL processors [Ref. 5], to
retrieve information from the DDS, to update information where
it is permitted, and to obtain the restriction protocols for
data consistency and integrity. The software interface can
generate file descriptions for storage in a program library,
or accept the user identification and generate a copy of that
user's database view. It is not possible for this study to
describe precisely the software interfaces needed for the
SPLICE system. Because this system is under development, many
aspects of the system are still unknown and the software
modules are not yet described in full detail. So, we will only
outline some of the software interfaces without claiming that
these are sufficient for the SPLICE system. Interfaces can be
added to the system during the later stages of the system life
cycle, and existing interfaces can also be changed or improved
as needed.

Because COBOL is used throughout the system, the COBOL
"GENERATE" command can create from the D/D fully formatted
file and record definitions that can be stored in a library
file. Included can be most COBOL clauses such as 88 levels,
SYNCHRONIZED, REDEFINES, and OCCURS. The OPTION clause of this
command can permit changes in names, the designation of
sequence numbers, level numbers and identifiers, and the
inclusion of program comments. An example of the use of this
command can be found in [Ref. 5 pp 261]. The generation date,
last revision date, and revision number can be automatically
recorded in both the listing and the D/D.
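
To make the generation step concrete, here is a small Python sketch that turns dictionary metadata into COBOL-style record definition lines. It does not reproduce the syntax of the "GENERATE" command in [Ref. 5]; the metadata layout, the function name generate_record, and the sample STOCK-ITEM record are all assumptions made for illustration.

```python
def generate_record(record_name, fields):
    """Return COBOL-style record definition lines from D/D field metadata.

    `fields` is a list of dicts with keys `name` and `pic`, and optionally
    `occurs` (an int) and `conditions` (88-level name -> value literal).
    """
    lines = [f"01  {record_name}."]
    for fld in fields:
        clause = f"    05  {fld['name']}  PIC {fld['pic']}"
        if "occurs" in fld:
            clause += f" OCCURS {fld['occurs']} TIMES"
        lines.append(clause + ".")
        # Condition names (88 levels) follow the field they qualify.
        for cond, value in fld.get("conditions", {}).items():
            lines.append(f"        88  {cond}  VALUE {value}.")
    return lines


# Hypothetical metadata as it might be held in the D/D.
stock_item = [
    {"name": "STOCK-NUMBER", "pic": "X(13)"},
    {"name": "ITEM-STATUS", "pic": "X", "conditions": {"ITEM-ACTIVE": "'A'"}},
    {"name": "MONTHLY-DEMAND", "pic": "9(5)", "occurs": 12},
]
copybook = generate_record("STOCK-ITEM", stock_item)
```

The resulting list of lines could be written to a library file so that every program describing the record uses the same definition held in the D/D.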

The output file can also contain job control statements. The
output file can then be executed as a job that creates and
catalogs the COBOL metadata as a member of a library under
control of any of the various source program managers.

A DML processor can also be used to interface between session
services and the data dictionary. A source program triggers
the DML processor by sending a service code through session
services, and the DML processor interacts with the data
dictionary/directory. The output of the DML processor is an
expanded source program that is sent to a compiler for
compilation.
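
The expansion step performed by a DML processor can be sketched in Python as follows. The statement form "DML <verb> <record>", the routine name DBACCESS, and the dictionary layout are invented for this example; the sketch shows only the lookup and expansion, not the service code exchange through session services.

```python
def expand_dml(source_lines, dictionary):
    """Replace lines that start with 'DML' by CALL statements built from the D/D."""
    expanded = []
    for line in source_lines:
        words = line.split()
        if words and words[0] == "DML":
            verb, record = words[1], words[2]
            entry = dictionary[record]      # metadata lookup in the D/D
            expanded.append(
                f'CALL "{entry["routine"]}" USING "{verb}" {entry["area"]}'
            )
        else:
            expanded.append(line)           # ordinary statements pass through
    return expanded


# Hypothetical D/D entry and source program.
dictionary = {"STOCK-ITEM": {"routine": "DBACCESS", "area": "STOCK-ITEM-AREA"}}
program = ["MOVE 1 TO COUNTER", "DML GET STOCK-ITEM", "DISPLAY STOCK-NUMBER"]
expanded = expand_dml(program, dictionary)
```

The expanded program has the same number of lines as the source, with each DML statement replaced by a CALL that a standard compiler can process.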

Other kinds of interfaces include query processors, source
program managers, various user interface facilities, and
other software packages.

Figure 4.2 Software Interface Using a DML Processor.


V. D/D IN DISTRIBUTED ENVIRONMENT

A. INTRODUCTION

In this chapter we will consider the design and function of
the DDS in the distributed database environment. Some
extensions to the centralized D/D are needed in order to
enable it to function effectively in a distributed
environment.

The distributed system is a subset of a general information
system. It is not necessary for the user to know how or where
the data is stored, in what way the data will be accessed by a
program, or how and where the processing is accomplished.
Unless the dictionary plays a highly active role in the
running of the distributed system, there is little need to try
to share one dictionary over the entire network. This is
because there is not likely to be a large amount of update
activity in a dictionary. The dictionary can normally be
reproduced at each node, and this is the proposed solution for
SPLICE. By using such an architecture, problems of updating
the dictionary across the network can be solved without much
overhead.

Of course the problem of distributed control in a network is
more complex than that of the hierarchical architecture of
dictionary systems which has been discussed in chapter two.
This is one reason, in addition to the lack of experience with
distributed data dictionary systems, why we proposed
replication instead of distribution of the data dictionary for
SPLICE. The more the dictionary system acts as either the
control mechanism or a repository of control information, the
more complex the DBMS, network operating systems, and
dictionary system interactions become. For example, in the
case where we want to determine the best location for running
a query against a partially replicated database [Ref. 6], the
dictionary system is required to retain information on the
location of all data. Indeed, this may be highly dynamic
itself, and therefore the line between a dictionary and a
"real" database becomes very fuzzy.

Creation of a distributed information resource implies that a
number of hardware and software components are to be designed
and integrated into a controlled environment. These components
in SPLICE include several databases and database management
systems, user language interfaces, data dictionary/directory
catalogue, transaction controllers and data input/output
control modules. We will describe the various system
components and we will also attempt to demonstrate their
integration with the International Organization for
Standardization (ISO) communications architecture, and a data
storage and retrieval architecture (DSRA).

In general, a distributed system must provide to the end user
transparency, data sharing, data transfer, process transfer,
or a facility for combination of strategic, managerial and
operational reporting. In order to do that there are several
environmental constraints that must be satisfied [Ref. 12].
These are:

Data communications
Data storage and retrieval
Metadata
User language support
Process and report management
Information representation
System management
Integrity
Security

For the SPLICE system, communication must be integrated with
cooperative processing of the various different existing
software and hardware. In order to do that we need to address
the considerations of the database interface with distributed
system tasks.

A distributed database is particularly useful to applications
that involve extensive processing in different locations.
SPLICE fits exactly in the above concept, as do airlines,
banking, retail, and military command and control
applications. The distributed database of SPLICE can be
allocated among the nodes of the network according to various
existing criteria for fragmentation. To avoid confusion in
distributed systems two different terms are used: partitioned
database, which consists of non-overlapping subsets, and
replicated database, which has some data redundancy [Ref. 5].
Replication enforces the locality and availability of the
database and reduces the frequency of accessing the DDN, but
requires the DBMS to provide more sophisticated concurrency
and recovery procedures. To avoid expensive overhead in data
management, restrictions must be established as to the degree
of data replication permitted. SPLICE belongs in the class of
replicated database because the same item of the database can
be located in several locations, while the local databases
also provide information for items stored in only one
location.
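
The distinction between the two terms can be expressed as a short Python check. The allocation format (a mapping from node name to the set of item identifiers stored there) and the function names are assumptions chosen for this sketch.

```python
from collections import Counter


def copy_counts(allocation):
    """Count at how many nodes each item is stored."""
    return Counter(item for items in allocation.values() for item in items)


def classify(allocation):
    """Return 'partitioned' if the node subsets do not overlap, else 'replicated'."""
    counts = copy_counts(allocation)
    return "replicated" if any(n > 1 for n in counts.values()) else "partitioned"


def replicated_items(allocation):
    """List the items that are stored at more than one node."""
    return sorted(item for item, n in copy_counts(allocation).items() if n > 1)


# Hypothetical allocations of item identifiers to nodes.
non_overlapping = {"node-a": {1, 2}, "node-b": {3}}
overlapping = {"node-a": {1, 2}, "node-b": {2, 3}}
```

In the second allocation item 2 is held at two nodes, which is the SPLICE situation described above: some items are redundant while others exist in only one location.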

Major problems in the development of techniques for a
distributed database are due to communication volumes and
delays and to the potential for parallel processing. Sometimes
it is very difficult to apply to distributed data processing
working solutions which are borrowed from the centralized
processing concept. These solutions often work well only in
one environment and do not transfer efficiently, so excessive
delays may occur. Parallel processing also has the potential
to increase throughput, but requires complex controls to
synchronize concurrent activities at dispersed sites. Because
a data dictionary is a database containing metadata, the same
problems existing in distributed databases also exist in a
distributed data dictionary. In [Ref. 5] five basic problems
are described which must be addressed in distributed data
management:

-The coordination of the DBMS with the data transmission
network such that reliable delivery of messages can be
ensured.

-The decomposition of transactions into atomic parts,
selection of nodes to execute those parts, and control of
any movement of data between sites necessary to process
transactions.

-The synchronization of logically related updates and
retrievals that are processed at different nodes.

-The detection and resolution of conditions where a part
of the database becomes inaccessible due to node or line
failure.

-The management of metadata describing the distributed
database and environment. This last problem refers
particularly to the data dictionary and deserves special
attention.

B. EXTENSIONS TO THE DDS

The role of a D/D in a distributed database environment is
very significant because it contains important information
about the description of the database distribution, the
characteristics of the nodes and other aspects of the data
communication network. Some additional entities must be
included in the DDS [Ref. 5]:

-The database entity, which describes the global view of
the database and includes attributes for relation and
attribute names, validity constraints, as well as
identification of local databases.

-The fragment entity, which describes portions of the
local database. This entity is not useful for SPLICE
because there are no fragments of the local database.

-The topology entity, which describes the physical
configuration of network components and the links between
the nodes.

-The node entity, which describes the combination of
network subcomponents at a particular site of the network.

-Finally some other entities (terminal, line, multiplexer,
processor) describing network design.

We cannot say exactly what new entities should be added to the
SPLICE DDS, but at least initially, we believe that a form of
topology and node entities must be included. These entities
are needed when non-local requests are processed, because the
software performing transaction management needs to reference
the D/D to determine the location of the needed data, the
user's access privileges, the status of the addressed nodes,
etc. The interfaces needed for this purpose can be dynamic or
static exactly as in the centralized case.
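
One possible shape for the node and topology entities, and for the lookup that transaction management would make on a non-local request, is sketched below in Python. The attribute names, the "up"/"down" status values, the node names, and the restriction to direct links are simplifying assumptions and not a proposed SPLICE schema.

```python
from dataclasses import dataclass, field


@dataclass
class NodeEntity:
    name: str
    status: str = "up"                            # assumed values: "up", "down"
    databases: set = field(default_factory=set)   # local databases at the node


@dataclass
class TopologyEntity:
    links: set = field(default_factory=set)       # frozensets of two node names

    def add_link(self, a, b):
        self.links.add(frozenset((a, b)))

    def linked(self, a, b):
        return frozenset((a, b)) in self.links


def candidate_nodes(database, user, origin, nodes, topology, privileges):
    """Return the names of nodes that can serve a non-local request."""
    if database not in privileges.get(user, set()):
        raise PermissionError(f"{user} may not access {database}")
    return [n.name for n in nodes
            if database in n.databases
            and n.status == "up"
            and topology.linked(origin, n.name)]


# Hypothetical network description held in the D/D.
nodes = [NodeEntity("node-a", databases={"parts"}),
         NodeEntity("node-b", status="down", databases={"parts"}),
         NodeEntity("node-c", databases={"fuel"})]
topology = TopologyEntity()
topology.add_link("node-x", "node-a")
topology.add_link("node-x", "node-b")
privileges = {"clerk": {"parts"}}
```

A request from node-x for the "parts" database is answered only by node-a here, because node-b is marked down and node-c holds a different database.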

C. THE DDS AS A DISTRIBUTED DATABASE

Practically, the D/D, when supporting a distributed system,
becomes itself a distributed database. The contents of the D/D
may reside at various locations. We cannot say that this
approach fits exactly in the SPLICE case. The approach we have
proposed for SPLICE is quite different. No partition of the
D/D is permitted. That means the D/D cannot be a distributed
database as we know it in the original form. The solution
proposed for the SPLICE DDS is based on replication instead of
distribution of the DDS. On the other hand, there are some
other reasonable solutions which follow more closely the
distributed concept. Since experience with distributed systems
is relatively small, the steps needed to reach a decision must
be taken very carefully in order to avoid mistakes.


The designer of a DDS encounters some of the same basic
problems as does the designer of a distributed database. When
we design a D/D we must determine the extent of environmental
dependency between the D/D and the DBMS. As we said before,
the distributed D/D is an extension of the centralized one,
and so the three basic variations of the relationship between
a DDS and a DBMS are still in force. In the independent
distributed approach the DDS has no running connections to any
portions of the DBMS and is not actively or directly used in
transaction processing by the DBMS. In the DBMS-application
approach the D/D is just another distributed database to the
DBMS, and separate data management functions are not needed to
handle the D/D. The DBMS may manage its own run-time directory
that is separate from the D/D. In the embedded distributed
approach the D/D provides the run-time directory for the DBMS.
All the components of the DBMS obtain their metadata from the
D/D. The size, location, and contents of the D/D would also
affect the performance of other DDS functions such as
maintenance, reporting, and query [Ref. 5].

D. A MODEL FOR A DISTRIBUTED DDS

In this section we are going to examine a distributed model
for the SPLICE DDS. Its structure is shown in Figure 2.4, and
involves the partition of the global DDS into different views
containing information for one or more local databases. These
different views can be located at each or selected LAN's.

The global (or network) dictionary is the nucleus around
which all the management functions of a DDS are centered.
It contains [Ref. 11] information to start every management
process of the SPLICE distributed database. In particular
it contains:

a.-Information for the DDS design

-File access programs
-Total volumes of queries for each file
-Total volumes of updates for each file

This statistical information is very useful, especially
for evaluating the optimal number of redundant copies.

b.-Information for the distribution function

-Number and types of transmission links, their unit
cost, their mean utilization factor
-Routing tables
-CPU workloads
-Disk utilization

This information can help determine the optimal allocation
of redundant file copies and of possible operation
parallelism.

c.-General information about data and how data is shared
among the various nodes of the system, including the number
of D/D copies and where they are located.

d.-Information about existing constraints, status of the
system, node failures etc.

e.-Information about data transportability

f.-Information related to data used by applications
having a global view. Such applications are, for example,
those where different local databases are involved for
execution. We said in a previous section that sometimes
data redundancy is preferable over the frequent use of the
DDN. That means information about the sites where a
component (i.e. spare part) is located must be held
somewhere in a central position. So in the case where the
component cannot be found in the local database, the user
has to access the global data dictionary to find the places
where the particular item is located.


To be able to design and run allocation or retrieval
programs the global D/D must contain information [Ref. 11]
about:

Data structures
Data location
Data availability
Data accessibility (related to security, compatibility etc)
Data translation maps, access paths
Data entities
Common procedures
Events and their interrelations

This dictionary must be able to answer queries about the DB
and DBMS's involved in a transaction and how the transaction
can be formulated to obtain the most efficient result.

Local dictionaries include information about local databases
and applications, local data entities, local procedures,
local interrelations, physical storage structures of local
data, access methods, access paths, physical storage
devices, and redundancy of data items.

In [Ref. 11] a structure is proposed for a distributed D/D
quite different from the SPLICE approach. This structure, as
shown in Figure 5.1, involves the existence of:

Network dictionary
Global external dictionary
Global conceptual dictionary
Local external dictionary
Local conceptual dictionary
Internal dictionary

and each one of the above performs a different function.

This architecture, which is purely distributed, is probably
too complicated to be implemented for SPLICE. It is a
theoretical model, and if we try to implement it we may face
serious interface problems, resulting in the data dictionary
becoming the main resource consumer.

Figure 5.1 A Purely Distributed Approach for a DDS.

The functions we intend to include in the SPLICE DDS will
play a major role if we want to avoid complex structure and
saturation. These functions must be the minimum needed for
the proper operation of the system. We believe that, in the
case where the distributed instead of the replicated approach
is followed, the architecture shown in Figure 2.4 is the more
practical.




Following the above architecture, a global dictionary located
in some node has the role of maintaining consistency
throughout the whole SPLICE system. Requests for updates,
deletions, and additions are routed through the data
dictionary administrator, and after an evaluation procedure
the global dictionary is updated. Then the changes are
transmitted to the various locations where the local copies
are updated. Updates are also transmitted to the data
directory.
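
The update path described above can be sketched in Python as follows. The class names, the version counter, the form of the evaluation procedure, and the use of an "address" key for the directory entry are assumptions for illustration; only additions and updates are shown, not deletions.

```python
class Dictionary:
    def __init__(self):
        self.entries = {}    # object name -> definition
        self.version = 0


class Administrator:
    def __init__(self, global_dd, local_copies, directory, evaluate):
        self.global_dd = global_dd
        self.local_copies = local_copies
        self.directory = directory      # object name -> address
        self.evaluate = evaluate        # evaluation procedure (assumed callable)

    def submit(self, name, definition):
        """Apply an addition or update if the evaluation accepts it."""
        if not self.evaluate(name, definition):
            return False
        # The global dictionary is updated first.
        self.global_dd.entries[name] = definition
        self.global_dd.version += 1
        # The change is then transmitted to every local copy.
        for local in self.local_copies:
            local.entries[name] = definition
            local.version = self.global_dd.version
        # Finally the data directory receives the new address.
        self.directory[name] = definition["address"]
        return True


global_dd = Dictionary()
copies = [Dictionary(), Dictionary()]
directory = {}
admin = Administrator(global_dd, copies, directory,
                      evaluate=lambda name, d: "address" in d)
accepted = admin.submit("STOCK-ITEM", {"address": "node-3", "type": "record"})
rejected = admin.submit("BAD-ITEM", {"type": "record"})
```

Routing every change through one point keeps the copies identical, at the price of making the administrator and the global dictionary a single place that all updates must pass through.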

Data directories can be located at the inventory control
points (ICP). In contrast with the data dictionary, the data
directory contains global information only about subject,
service code, object name, and address. All the other
information is located in the global and the various local
dictionaries. The data dictionary administrator is
responsible for maintaining the data directory as well.
Different views of the global dictionary are located in
various LAN's. Each view can serve one or more LAN's, and it
is preferable for it to be located at the LAN where it is
most frequently used in order to avoid unnecessary usage of
the DDN.

When an item is not found in the local database, the user
routes a value location request through the session services
(service code) to the data directory, and the data directory
replies with the location address. Using this information the
user can request and establish a session with the remote
database where the requested information resides.
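
The lookup sequence in the paragraph above can be written as a short Python sketch. Plain dictionaries stand in for the local database, the data directory, and the remote databases; the function name, item names, and node addresses are assumptions, and the session establishment itself is reduced to a dictionary access.

```python
def locate_and_fetch(item, local_db, directory, remote_dbs):
    """Return (source, value) for an item, consulting the directory on a miss."""
    if item in local_db:
        return "local", local_db[item]
    address = directory.get(item)      # value location request
    if address is None:
        raise LookupError(f"{item} is not known to the data directory")
    remote = remote_dbs[address]       # stands in for establishing a session
    return address, remote[item]


# Hypothetical contents.
local_db = {"bolt": 120}
directory = {"bolt": "node-1", "gasket": "node-7"}
remote_dbs = {"node-7": {"gasket": 15}}
```

The DDN is used only when the local lookup fails, which is the reason the directory needs to hold no more than the address information listed earlier.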


VI. CONCLUSIONS AND RECOMMENDATIONS


A. CONCLUSIONS

Our objectives, as described in the first chapter, were to
investigate the area of data dictionary/directory systems in
a distributed environment, to outline the advantages and
disadvantages of these systems, to present the underlying
ideas, to examine the benefits for the SPLICE system from
using a dictionary/directory system, and finally to delineate
the interface requirements between a data
dictionary/directory system and other functional modules. In
addition to the above objectives we also discussed some ideas
concerning the organization of the data administration
function, and four hierarchical architectures for the DDS,
each one with a different degree of distribution.

The first architecture is based on the replication of the
D/D. There are no different views of the D/D, only exact
copies of one view located in each LAN. Using this
architecture we have 62 replicated copies of the D/D (the
same as the number of LAN's), each containing the information
(metadata) about all SPLICE data base definitions and
functions residing in each LAN. This architecture minimizes
access to the DDN but has the drawback of requiring a lot of
secondary storage. The size of the D/D, statistical and other
information concerning the frequency of using the DDN, and
the amount of information included in the D/D will all have
an impact on the effectiveness of this architecture.

The second architecture, which allocates replicated copies of
the D/D to selected nodes (the most active), is more
conservative. In the case of a huge dictionary, this requires
heavier use of the DDN. Here the size of the D/D and the
appropriate nodes at which to install the replicated copies
seriously affect the effectiveness of this architecture.

The third architecture is based on distribution of the D/D.
Different views of the D/D reside in each LAN and contain
information concerning only the local data base. This
architecture involves the use of a data directory (we propose
two replicated copies, one located in each ICP). The use of
the data directory (which contains limited information)
provides a kind of "relation or connection" between the
various views. Also, a global dictionary is needed in order
to provide consistency and global function facilities
throughout the system. This architecture is more dynamic than
the previous two. It has the advantage of saving secondary
storage but, on the other hand, increases the use of the DDN
even more.

A fourth architecture was discussed just to mention another
possibility for a distributed architecture, but our
estimation is that it would be too expensive in system
resource consumption for SPLICE.

Three environmental dependency options for the DDS
(independent, completely integrated, and DBMS dependent)
were also discussed. The embedded (DBMS dependent) approach
was chosen for three reasons: the data dictionary is going
to be used only for the SPLICE system (so the independent
approach does not make any sense), the SPLICE data base
already exists, and the DBMS environments are homogeneous
across LANs. The independent and completely integrated
approaches are too costly at this time, although the latter
could eventually be implemented from an embedded environment.


RECOMMENDATIONS


From the investigations performed, we have the following
main recommendations for the SPLICE system:

a.- The TANDEM data dictionary that already exists
should be the basis for the SPLICE data dictionary.

b.- The D/D should be implemented only for new
applications because it is a herculean task to retrofit the D/D to
the existing old applications.

c.- The embedded (DBMS dependent) approach should be
used for the D/D.

d.- Two candidate architectures should be examined
further based on statistical and other information (not
available for the present thesis):

-Replicated architecture (Figure 2.3) with
selection of nodes where each copy will reside.

-Distributed architecture (Figure 2.4) with the
use of two replicated copies of the data
directory, one located at each ICP.

e.- A DML processor should be used to interface between
the data dictionary and session services.


APPENDIX A

TANDEM DATA DICTIONARY


1. Overview

This appendix is included to mention some features
(hopefully the most important) of the TANDEM data dictionary,
since the TANDEM DBMS will be used in the SPLICE system.
For a more detailed description of the TANDEM D/D, see
[Ref. 13].

A data definition language (DDL) is a language used
by the data dictionary administrator to describe record and
file structures of a database. After the description, the
resulting source file is input to the DDL compiler, and the
DDL compiler can create data declaration source language for
database records in three languages: COBOL, FORTRAN, and
TAL. The DDL compiler can also produce FUP (file utility
program) file creation commands for database files. The
most significant feature of DDL is its ability to create and
maintain a data dictionary. The TANDEM data dictionary is a
set of seven files that documents the structure and location
of each file in a database.

The DDL provides facilities for updating a
dictionary as the database it describes grows and the
structure of the database files changes. The DDL compiler and
the dictionary it creates serve as a central point of
control over a database.

TANDEM defines a database as a collection of files
structured to serve one or more applications. When a list
of DDL statements (a DDL source schema) is given to the
DDL compiler, the compiler can produce any of the following
files:

* A data dictionary.

* A FUP file creation command source.

* A data declaration source for COBOL,
FORTRAN, or TAL.

* A schema report summarizing each record's
structure and each file's access keys.

The data dictionary produced by the DDL compiler is
a set of files that forms a permanent record of the database
schema. Thus the database schema, stored as a set of
dictionary files, becomes a system resource. The dictionary
gives database managers information about each file in the
database and also shows how the files relate to each other.
After the dictionary has been created, the DDL compiler can
read the dictionary and produce COBOL, FORTRAN, or TAL data
declaration source for any record defined by the schema.
The dictionary is also used by ENFORM, TANDEM's database
query language and report writer.
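
The idea of producing data declaration source from a stored record description can be illustrated with a small sketch. It is a toy only: the classes, the field names, and the generated layout are hypothetical, and the code reproduces neither TANDEM DDL syntax nor the actual output of the DDL compiler. It merely shows one record description being turned into a COBOL-style declaration.

```python
# Toy illustration only: NOT Tandem DDL syntax or real DDL compiler output.
from dataclasses import dataclass
from typing import List


@dataclass
class Field:
    name: str      # COBOL-style field name, e.g. "PART-NO" (hypothetical)
    picture: str   # COBOL PICTURE string, e.g. "9(6)" or "X(20)"


@dataclass
class RecordDef:
    name: str            # record name as stored in the toy dictionary
    fields: List[Field]  # elementary fields of the record


def cobol_declaration(record: RecordDef) -> str:
    """Render a record description as a COBOL-style level-01 declaration."""
    lines = [f"01  {record.name}."]
    for field in record.fields:
        # Each elementary field becomes a level-05 item with a PICTURE.
        lines.append(f"    05  {field.name:<12} PIC {field.picture}.")
    return "\n".join(lines)


if __name__ == "__main__":
    part = RecordDef("PART-REC", [
        Field("PART-NO", "9(6)"),
        Field("DESCRIPTION", "X(20)"),
        Field("QTY-ON-HAND", "9(5)"),
    ])
    print(cobol_declaration(part))
```

The same stored description could equally drive a FORTRAN or TAL generator, which is the sense in which the schema becomes a shared system resource.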

2.  Creating  a  Dictionary 

The data dictionary files can be created on any
subvolume in the system. The subvolume that is to contain
the data dictionary is specified with the DDL DICT command
(for example ?DICT $STOCKNO.QNTY). The DDL compiler first
creates the dictionary files on the QNTY (quantity) subvolume of
the $STOCKNO volume, and then opens the files for access.
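
As a minimal sketch of how the command in the example decomposes, the following splits a DICT command line into its volume and subvolume parts. The function name is hypothetical, and the sketch handles only the single form quoted above (a volume introduced by "$" followed by a subvolume); any other forms the DDL compiler may accept are outside its scope.

```python
def parse_dict_command(line):
    """Split a '?DICT $VOLUME.SUBVOLUME' line into (volume, subvolume).

    Handles only the form shown in the text; this is an illustrative
    assumption, not a description of the DDL compiler's own parser.
    """
    parts = line.split()
    if len(parts) != 2 or parts[0].upper() != "?DICT":
        raise ValueError(f"not a DICT command: {line!r}")
    volume, separator, subvolume = parts[1].partition(".")
    if not volume.startswith("$") or not separator or not subvolume:
        raise ValueError(f"expected $VOLUME.SUBVOLUME, got {parts[1]!r}")
    return volume, subvolume


if __name__ == "__main__":
    volume, subvolume = parse_dict_command("?DICT $STOCKNO.QNTY")
    print("volume:", volume)
    print("subvolume:", subvolume)
```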

3. Dictionary Reports

TANDEM provides DDL users with ENFORM source for
twelve dictionary reports. The twelve reports document all
of the DEFINITION and RECORD entries in the dictionary,
describing not only their structures, but how they relate to
each other as well.

Once a schema describing a database has been
compiled by the DDL compiler and a dictionary has been
produced, information about the database can easily be
obtained with a set of TANDEM-provided ENFORM queries. The
reports produced by these queries provide:

* Database documentation.

* Database analysis information.

* Quick access to dictionary contents.

The dictionary reports are produced from ENFORM
source that is available to the user. This means that in
addition to the standard reports, you can obtain customized
reports, tailored to answer specific questions, by simply
editing the TANDEM-supplied ENFORM source. The ENFORM
dictionary report source file consists of 12 queries that
produce 12 different reports. Each query is a separate
section. Thus the queries can be run as a complete group,
individually, or in any combination. The 12 dictionary
reports are shown in Table X.
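
To give a feeling for the kind of relationship these reports expose, the sketch below builds a "where used" cross-reference of the sort listed under report R6 in Table X. It is written in Python rather than ENFORM, and the entry names and the shape of the input mapping are hypothetical; it is an illustration of the idea, not of the TANDEM-supplied queries.

```python
from collections import defaultdict


def where_used(references):
    """Invert a 'references' mapping into a 'referenced by' mapping.

    references maps each dictionary object (DEF or RECORD) to the list
    of DEF names it references. The result maps each referenced DEF to
    the sorted list of objects that reference it.
    """
    used_by = defaultdict(list)
    for obj, referenced_defs in references.items():
        for def_name in referenced_defs:
            used_by[def_name].append(obj)
    return {name: sorted(users) for name, users in used_by.items()}


if __name__ == "__main__":
    # Hypothetical dictionary contents: two DEFs and one RECORD.
    refs = {
        "ZIP-CODE": [],                       # DEF with no references
        "ADDRESS": ["ZIP-CODE"],              # DEF using another DEF
        "CUSTOMER": ["ADDRESS", "ZIP-CODE"],  # RECORD using two DEFs
    }
    for def_name, users in where_used(refs).items():
        print(def_name, "is referenced by", ", ".join(users))
```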

4. Updating the Dictionary

As the database changes, its dictionary can be
updated to reflect the changes by adding, deleting or
modifying DEFINITION and RECORD entries. Table XI gives a
summary of the TANDEM dictionary modification functions.

TABLE X

Dictionary Report Summary


Rpt   Report description

R1    DICTIONARY OBJECTS- R1 describes each DEF and
      RECORD in the dictionary, giving the time and
      date of creation, the time and date of the
      last modification, and the version number for
      each object.

R2    DEFINITION STRUCTURE- R2 lists all of the
      component groups and fields for each DEF in
      the dictionary.

R3    RECORD STRUCTURE- R3 lists all of the
      component groups and fields for each RECORD
      in the dictionary.

R4    DEFINITIONS USING DEFINITIONS- R4 shows
      which DEFs are referenced by other DEFs.
      Each referencing DEF is listed with each
      of its elements that references another
      DEF and the referenced DEF's name.

R5    RECORDS USING DEFINITIONS- R5 shows which
      DEFs are referenced by RECORDs. Each RECORD
      is listed with each of its elements that
      references a DEF and the referenced DEF's
      name.

R6    DEFINITIONS WHERE USED- R6 lists each DEF
      that is referenced by another object, be it
      a DEF or a RECORD. The referencing DEF or
      RECORD is shown in each case.

R7    RECORD ACCESS- R7 lists the file name and
      access keys (both primary and alternate) for
      each RECORD in the dictionary.

R8    RECORD DEFINITION METHOD- R8 shows the method
      used to define each RECORD. The source DEF
      is listed for those RECORDs defined with the
      DEF IS <def name> clause.

R9    REPORT HEADINGS- R9 lists all of the ENFORM
      report headings declared for fields and
      groups within each DEF and RECORD in the
      dictionary.

R10   DISPLAY FORMATS- R10 lists all of the ENFORM
      display formats declared for fields and
      groups within each DEF and RECORD in the
      dictionary.

R11   RECORD COMMENTS- R11 lists the comments that
      immediately preceded the defining RECORD
      statement for each RECORD in the dictionary.

R12   DEFINITION COMMENTS- R12 lists the comments
      that immediately preceded the defining DEF
      statement for each DEF in the dictionary.

TABLE XI

Dictionary Modification Function


Type              Procedure

ADD/DEF           Open dictionary with ?DICT and
                  compile new DEF statement.

ADD/RECORD        Open dictionary with ?DICT and
                  compile new RECORD statement.

DELETE/DEF        Open dictionary with ?DICT, delete
                  all dictionary entries that
                  reference the DEF, and then delete
                  the DEF itself with DELETE.

DELETE/RECORD     Open dictionary with ?DICT
                  and then delete the RECORD entry
                  with the DELETE statement.

MODIFY/DEF        Open dictionary with ?DICT
                  command, then delete all other
                  RECORD and DEF entries that
                  reference the DEF, delete the DEF,
                  recompile the edited DEF, and,
                  finally, recompile the DEF and
                  RECORD statements that
                  reference the DEF.

MODIFY/RECORD     Open dictionary with ?DICT
(with no          and recompile edited
DEF changes)      RECORD statement.

MODIFY/RECORD     Open dictionary with ?DICT
(with DEF         and delete the RECORD with
changes)          the DELETE statement. Then
                  modify any DEF entries that need
                  to be changed, and finally,
                  recompile the new record statement.
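
The MODIFY/DEF procedure is the most involved entry in Table XI, so a rough sketch of its order of operations may help. The dictionary is modeled here as a plain Python mapping, and the function names, entry names, and log messages are hypothetical. The sketch also assumes that entries referencing the DEF indirectly (through another DEF) must be removed as well, which the table does not state explicitly.

```python
def dependents_of(dictionary, target):
    """Names of all entries that reference target, directly or indirectly.

    dictionary maps an entry name to {"kind": "DEF" | "RECORD",
    "refs": [names of referenced DEFs]}. Indirect references are
    followed as an assumption of this sketch.
    """
    found = []
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for name, entry in dictionary.items():
            if current in entry["refs"] and name not in found:
                found.append(name)
                frontier.append(name)
    return found


def modify_def(dictionary, target, new_entry):
    """Apply the MODIFY/DEF steps to the toy dictionary; return a step log."""
    log = []
    dependents = dependents_of(dictionary, target)
    saved = {name: dictionary[name] for name in dependents}
    # Step 1: delete the entries that reference the DEF. Reverse discovery
    # order is enough for a simple chain; a real tool would need a proper
    # dependency ordering.
    for name in reversed(dependents):
        del dictionary[name]
        log.append(f"DELETE {name}")
    # Step 2: delete the DEF itself.
    del dictionary[target]
    log.append(f"DELETE {target}")
    # Step 3: recompile the edited DEF.
    dictionary[target] = new_entry
    log.append(f"COMPILE {target}")
    # Step 4: recompile the DEF and RECORD statements that reference it.
    for name in dependents:
        dictionary[name] = saved[name]
        log.append(f"COMPILE {name}")
    return log


if __name__ == "__main__":
    toy = {
        "ZIP-CODE": {"kind": "DEF", "refs": []},
        "ADDRESS": {"kind": "DEF", "refs": ["ZIP-CODE"]},
        "CUSTOMER": {"kind": "RECORD", "refs": ["ADDRESS"]},
    }
    for step in modify_def(toy, "ZIP-CODE", {"kind": "DEF", "refs": []}):
        print(step)
```

The sketch makes visible why modifying a widely referenced DEF is costly: every entry that depends on it is deleted and recompiled.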